Skip to content
This repository has been archived by the owner on Feb 28, 2024. It is now read-only.

[BUG-40565] Significant script performance degradation observed in server version 16.09.14.319788 #12225

Open
sl-service-account opened this issue Sep 23, 2016 · 9 comments

Comments

@sl-service-account
Copy link

Steps to Reproduce

My Online Sender product, which contains a large number of database scripts, is taking an order of magnitude longer to initialize, resulting in a database timeout error. This only started occurring on 9/22/2016 after my region was updated to Second Life RC BlueSteel 16.09.14.319788. I have confirmed the same thing is happening in all other RC channel servers running this version that I have tested in. I have also confirmed that it is NOT happening in any main channels running server version 16.08.31.319299.

Actual Behavior

Several of my products started experiencing significant script slowdowns (roughly 10 times slower) on RC server version 16.09.14.319788, causing various timeouts. This slowdown is NOT observed on main channel server version 16.08.31.319299.

Expected Behavior

I expect the newer version server to execute scripts at approximately the same speed as previous version. Not 10 times slower.

Other information

I have created test objects and scripts to reproduce the problem. One simply executes a simple loop and chats the time it took to execute the loop n times. The other reads a 100 line notecard and chats the time it took to read each line, the total time and the average time per line. The slowdown is clearly evident in both, ranging from 7 to 15 times slower in server version 16.09.14.319788 versus 16.08.31.319299. The problem only occurs only when both of the following are true.

  1. The script is running on sim server version 16.09.14.319788.
  2. The object containing the test script also contains at least 4 other scripts. These other scripts can be "do-nothing" scripts, i.e. scripts which have no executable lines of code, or are the default "new script".
  3. The 4 or more other scripts must be identical.

It was noted that the test script runs at normal speed with up to 3 "do nothing" scripts present. When a 4th "do nothing" script is added, the slowdown happens. Adding additional "do nothing" scripts does not result in additional slowing. There appears to be a step change in execution speed when going from 4 to 5 total scripts in the object.

Original Jira Fields
Field Value
Issue BUG-40565
Summary Significant script performance degradation observed in server version 16.09.14.319788
Type Bug
Priority Unset
Status Been Triaged
Resolution Triaged
Reporter Fred Allandale (fred.allandale)
Created at 2016-09-23T17:55:45Z
Updated at 2016-09-26T17:26:17Z
{
  'Business Unit': ['Platform'],
  'Date of First Response': '2016-09-23T15:08:20.891-0500',
  "Is there anything you'd like to add?": 'I have created test objects and scripts to reproduce the problem. One simply executes a simple loop and chats the time it took to execute the loop n times. The other reads a 100 line notecard and chats the time it took to read each line, the total time and the average time per line. The slowdown is clearly evident in both, ranging from 7 to 15 times slower in server version 16.09.14.319788 versus 16.08.31.319299.  The problem only occurs only when both of the following are true.\r\n\r\n1. The script is running on sim server version 16.09.14.319788.\r\n2. The object containing the test script also contains at least 4 other scripts. These other scripts can be "do-nothing" scripts, i.e. scripts which have no executable lines of code, or are the default "new script".\r\n\r\nIt was noted that the test script runs at normal speed with up to 3 "do nothing" scripts present. When a 4th "do nothing" script is added, the slowdown happens. Adding additional "do nothing" scripts does not result in additional slowing. There appears to be a step change in execution speed when going from 4 to 5 total scripts in the object. ',
  'ReOpened Count': 0.0,
  'Severity': 'Unset',
  'System': 'SL Simulator',
  'Target Viewer Version': 'viewer-development',
  'What just happened?': 'Several of my products started experiencing significant script slowdowns (roughly 10 times slower) on RC server version 16.09.14.319788, causing various timeouts. This slowdown is NOT observed on main channel server version 16.08.31.319299.',
  'What were you doing when it happened?': 'My Online Sender product, which contains a large number of database scripts, is taking an order of magnitude longer to initialize, resulting in a database timeout error. This only started occurring on 9/22/2016 after my region was updated to Second Life RC BlueSteel 16.09.14.319788. I have confirmed the same thing is happening in all other RC channel servers running this version that I have tested in. I have also confirmed that it is NOT happening in any main channels running server version 16.08.31.319299.',
  'What were you expecting to happen instead?': 'I expect the newer version server to execute scripts at approximately the same speed as previous version. Not 10 times slower.',
  'Where': 'See Environment.',
}
@sl-service-account
Copy link
Author

Fred Allandale commented at 2016-09-23T18:11:26Z, updated at 2016-09-25T00:41:18Z

Here is a script that will repro the problem. Follow these steps:

  1. Go to a region running server version 16.09.14.319788 (e.g. BlueSteel, LeTigre or Magnum).
  2. Put the test script below in a prim.
  3. Touch the prim and wait. It should show a execution time of about 4-6 seconds. This is the normal speed.
  4. Add 3 identical do-nothing scripts such as the default "new script" or a script with no executable code.
  5. Repeat step 3 and observe normal execution speed.
  6. Add 1 more identical do-nothing script (1 test script and 4 identical do-nothing scripts).
  7. Touch the script and wait. It should show a much longer execution time, e.g. about 60-80 seconds.
  8. Add several more identical do-nothing scripts. Touch the prim and observe no additional degradation in execution time.
  9. Go to a main channel regions running server version 16.08.31.319299.
  10. Touch the prim and wait. Observe normal execution speed (about 4-6 seconds) regardless of the number of identical do-nothing scripts.

Test Script:

default
{
state_entry()
{

}
touch_start(integer total_number)
{
    integer n = 100000;
    integer i;
    float starttime;
    float endtime;
    string txt;
    starttime = llGetTimeOfDay();
    llOwnerSay("Starting [n=" + (string)n + "]");
    for(i=0;i<n;i++)
    {
        txt = (string)i;
    }
    endtime = llGetTimeOfDay();
    llOwnerSay("Execution time for n=" + (string)n + " was " + (string)(endtime - starttime) + " seconds.");
}

}

Do-nothing script:

default
{
state_entry()
{

}

}

@sl-service-account
Copy link
Author

Fred Allandale commented at 2016-09-23T18:21:17Z

Here is another repro script. This one reads notecard lines. Put this script in a prim along with a notecard named "Test NC" that has about 100 lines. Then follow the repro steps in my previous comment. The script will chat the execution time for each notecard line read as well as the total read time and average per line time.

integer Line;
string NCName = "Test NC";
key ReqID;
float NCStartTime;
float NCFinishTime;
float LineStartTime;

default
{
state_entry()
{

}

touch_start(integer total_number)
{
    Line = 0;
    NCStartTime = llGetTimeOfDay();
    LineStartTime = NCStartTime;
    ReqID = llGetNotecardLine(NCName,Line);
}

dataserver(key req, string data)
{
    if(req != ReqID) return;
    if(data != EOF)
    {
       llOwnerSay("Line " + (string)Line + " read time was " + (string)(llGetTimeOfDay() - LineStartTime));
       Line++;
       LineStartTime = llGetTimeOfDay(); 
       ReqID = llGetNotecardLine(NCName,Line);
     }
    else
    {
        NCFinishTime = llGetTimeOfDay();
        llOwnerSay("Notecard with " + (string)Line + " lines was read in " + (string)(NCFinishTime - NCStartTime) + " seconds.");
        llOwnerSay("Average notecard read time was " + (string)((NCFinishTime - NCStartTime)/Line) + " seconds.");
    }
}

}

@sl-service-account
Copy link
Author

Caleb Linden commented at 2016-09-23T20:08:21Z, updated at 2016-09-23T20:08:31Z

Thanks for letting us know, we are investigating it momentarily

@sl-service-account
Copy link
Author

Vix Linden commented at 2016-09-24T03:48:58Z

The region [Mali Bay] fits the description of this issue. Scripts Run very low, script time nearly at zero considering there are over 6,000 active scripts. My own troubleshooting scripts could not rez.

@sl-service-account
Copy link
Author

Kyle Linden commented at 2016-09-24T04:09:13Z

Set 'Needs More Info' to gather more reports.

@sl-service-account
Copy link
Author

Darien Caldwell commented at 2016-09-24T20:05:21Z, updated at 2016-09-24T20:10:48Z

I tried this two ways in RC LeTigre 16.09.14.319788.

First attempt at Repro:

[12:53:34] Object: Starting [n=100000]
[12:53:41] Object: Execution time for n=100000 was 6.518555 seconds.
[12:53:47] Object: Starting [n=100000]
[12:53:53] Object: Execution time for n=100000 was 6.000244 seconds.
[12:53:55] Object: Starting [n=100000]
[12:54:01] Object: Execution time for n=100000 was 5.820313 seconds.

[12:54:05] Object: Hello, Avatar!
[12:54:24] Object: Hello, Avatar! [ Compiled 3 unique scripts with do-nothing code]
[12:54:40] Object: Hello, Avatar!

[12:55:03] Object: Starting [n=100000]
[12:55:09] Object: Execution time for n=100000 was 6.064941 seconds.
[12:55:11] Object: Starting [n=100000]
[12:55:17] Object: Execution time for n=100000 was 6.647461 seconds.
[12:55:19] Object: Starting [n=100000]
[12:55:25] Object: Execution time for n=100000 was 6.872803 seconds.

[12:55:31] Object: Hello, Avatar! [Compiled a 4th unique script with do-nothing code]

[12:55:52] Object: Starting [n=100000]
[12:55:58] Object: Execution time for n=100000 was 6.133545 seconds.
[12:55:59] Object: Starting [n=100000]
[12:56:05] Object: Execution time for n=100000 was 5.625244 seconds.
[12:56:06] Object: Starting [n=100000]
[12:56:12] Object: Execution time for n=100000 was 5.978271 seconds.

No Degradation observed.

Second attempt at Repro:

[12:57:02] Object: Starting [n=100000]
[12:57:08] Object: Execution time for n=100000 was 6.514648 seconds.
[12:57:10] Object: Starting [n=100000]
[12:57:16] Object: Execution time for n=100000 was 6.044189 seconds.
[12:57:17] Object: Starting [n=100000]
[12:57:23] Object: Execution time for n=100000 was 5.845947 seconds.

[ Dropped in 3 copies of the same 'do-nothing' script asset.]

[12:57:41] Object: Starting [n=100000]
[12:57:47] Object: Execution time for n=100000 was 6.221191 seconds.
[12:57:48] Object: Starting [n=100000]
[12:57:54] Object: Execution time for n=100000 was 6.066406 seconds.
[12:57:55] Object: Starting [n=100000]
[12:58:01] Object: Execution time for n=100000 was 6.223145 seconds.

[ Dropped in 4th copy of the same 'do-nothing' script asset]

[12:58:10] Object: Starting [n=100000]
[12:59:33] Object: Execution time for n=100000 was 83.017330 seconds.

So it seems the issue may be a factor of "Mono bytecode sharing". (outlined here under "mono tips": http://wiki.secondlife.com/wiki/Mono#Mono_benefits

@sl-service-account
Copy link
Author

Fred Allandale commented at 2016-09-25T00:47:28Z

Darien's observation is correct. The additional scripts must be identical copies to cause the problem to occur.

@sl-service-account
Copy link
Author

Lucia Nightfire commented at 2016-09-25T02:21:31Z

The focus seems to be 4 or more of the same script UUID in the same link, whether they be Mono or LSO, running or not.

There is also significant delays with function use with loops, jumps, timers or timed functions.

@sl-service-account
Copy link
Author

Kyle Linden commented at 2016-09-26T17:26:17Z

Thank you all for the information. We have what we need for a fix.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

1 participant