This repository has been archived by the owner on Mar 28, 2024. It is now read-only.

[BUG-231884] llRequestAgentData(id,DATA_NAME) stops responding after 3000-5000 requests #9259

Closed
sl-service-account opened this issue Mar 5, 2022 · 7 comments


sl-service-account commented Mar 5, 2022

What just happened?

I sell a product that sends inventory to a list of subscribers. Part of its function requires getting an avatar's name from its key (UUID) using llRequestAgentData. Within the past week I've gotten several reports from customers that after making about 3000-5000 requests, the dataserver stops responding, typically for 30-50 consecutive requests, after which it may or may not resume responding for some number of requests before failing again. My script has a 60 second timeout on dataserver responses.

What were you doing when it happened?

The device was processing a long list of avatar keys to get each avatar's name via llRequestAgentData(id,DATA_NAME). After about 3000-5000 requests, the dataserver stops responding. After stopping it for about an hour, I was able to resume normal operation for a few hundred more requests, after which it stops responding again. I regulate the request rate to no more than one per second. I also observed that when one sender device encounters this problem and another sender is rezzed, it almost immediately encounters the same problem, i.e. it appears to affect requests from all devices in the region.

What were you expecting to happen instead?

Expected the dataserver to be able to sustain a continuous request stream indefinitely at a rate of one request per second.

Other information

My product (Online Sender) has been operating correctly for several years with over 1000 sold. The current issue started roughly with the rollout of Second Life Server 2022-02-10.568388 and has been reported by several customers in different regions. I have now been able to replicate it in my home region (Distorted). I have also created a very simple script that replicates the problem, to be sure it is not related to the actual Online Sender. See attachment for the replication script.
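
The core pattern involved is simply requesting an avatar's legacy name from its key and giving the dataserver 60 seconds to answer. A minimal sketch of that pattern (not the actual Online Sender script; the handler and variable names are illustrative):

// Simplified sketch: look up one avatar's legacy name from its key,
// with a 60-second timeout on the dataserver reply.
key gQuery;      // handle returned by llRequestAgentData
key gTarget;     // avatar key currently being looked up

lookup(key av)
{
    gTarget = av;
    gQuery = llRequestAgentData(av, DATA_NAME);
    llSetTimerEvent(60.0);              // 60 second timeout on the reply
}

default
{
    touch_start(integer n)
    {
        lookup(llDetectedKey(0));       // example: look up the toucher
    }

    dataserver(key query_id, string data)
    {
        if (query_id == gQuery)
        {
            llSetTimerEvent(0.0);       // cancel the timeout
            llOwnerSay("Name: " + data);
        }
    }

    timer()
    {
        llSetTimerEvent(0.0);
        llOwnerSay("Dataserver timed out waiting for " + (string)gTarget);
    }
}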

Original Jira Fields
Issue: BUG-231884
Summary: llRequestAgentData(id,DATA_NAME) stops responding after 3000-5000 requests
Type: Bug
Priority: Unset
Status: Closed
Resolution: Triaged
Reporter: Fred Allandale (fred.allandale)
Created at: 2022-03-05T17:30:58Z
Updated at: 2022-04-21T21:43:48Z
{
  'Build Id': 'unset',
  'Business Unit': ['Platform'],
  'Date of First Response': '2022-03-14T20:03:10.151-0500',
  "Is there anything you'd like to add?": 'My product (Online Sender) has been operating correctly for several years with over 1000 sold. The current issue started roughly with the rollout of Second Life Server 2022-02-10.568388 and has been resported by several customers in different regions. I have now been able to replicate it in my home region (Distorted). I have also created a very simple script that also replicates the problem to be sure it is not related to the actual Online Sender. I will post this script and sample output in a comment.',
  'ReOpened Count': 0.0,
  'Severity': 'Unset',
  'System': 'SL Simulator',
  'Target Viewer Version': 'viewer-development',
  'What just happened?': "I sell a product that sends inventory to a list of subscribers. Part of its function requires getting the avatar's name from its key (uuid) using llRequestAgentData. Within the past week I've gotten several reports from customers that after making about 2000 requrests, the dataserved stops responding. My script has a 60 second timeout on dataserver responses. ",
  'What were you doing when it happened?': "The device was processing a long list of avatar keys to get the avatar's name via llRequestAgentData(id,DATA_NAME). After about 2000 requests, the dataserver stops responding. After stopping it for about an hour, was able to resume normal operation for a few hundred more requests, after which it stops responding again. I regulate the request rate to no more than one per second. I also observed that when one sender device encounters this problem, and another sender is rezzed, it almost immediately encouters the same problem, i.e. it appears to be affecting requests from all devices in the region.",
  'What were you expecting to happen instead?': 'Expected the dataserver to be able to sustain a continuous request stream indefinitely at a rate of one request per second. ',
  'Where': 'See environment above.\r\nAlso observed/reported in the following regions:\r\nElmira\r\nEmotion\r\nHolly Kai Estates\r\nDistorted\r\nGeum',
}

Fred Allandale commented at 2022-03-05T18:02:17Z, updated at 2022-03-05T23:54:17Z

The attached script can be used to replicate the issue. Basically it reads a list of avatar keys from notecards and requests each avatar's name from the dataserver via llRequestAgentData(AvKey,DATA_NAME). It outputs the returned name, the count of keys read, the count of names returned, and the dataserver response time. It also includes a 60 second timeout in the event of no dataserver response, after which it moves on to the next key.

CAUTION: This script requires several notecards with several thousand unique, valid avatar keys. This problem does not occur until between 2000 and 5000 consecutive requests have been made (approx 1 per second). It does NOT occur if repeated requests are made for the same key. It does NOT occur if the name has been cached to the local simulator previously. It only occurs if the request has to communicate back to the central database.

NOTE: My observation is that this behaves very much like the throttles imposed on certain other LSL functions (e.g. llGiveInventory()), however there has (to my knowledge) never been a request rate throttle on llRequestAgentData(). Even if this is the result of a new, undocumented throttle, my replication stays well below the 2 per second continuous rate that other throttles impose. Its behaviour is also similar to other throttles in that, once it triggers, it applies to all scripted objects with the same owner in the same sim, and stopping for some period of time allows normal operation to resume until it triggers again.
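
In outline, the replication approach looks roughly like the following. This is a simplified sketch, assuming a single notecard named "keys" with one avatar UUID per line; the actual attached script reads several notecards and does more bookkeeping:

string NOTECARD = "keys";    // assumed: one avatar UUID per line
integer gLine;               // current notecard line
key     gLineQuery;          // handle for the notecard read
key     gNameQuery;          // handle for the DATA_NAME request
integer gKeyCount;           // keys read so far
integer gNameCount;          // names returned so far
float   gSent;               // script time when the current request was sent

default
{
    state_entry()
    {
        gLineQuery = llGetNotecardLine(NOTECARD, gLine);
    }

    dataserver(key query_id, string data)
    {
        if (query_id == gLineQuery)             // a line arrived from the notecard
        {
            if (data == EOF)
            {
                llOwnerSay("Done: " + (string)gKeyCount + " keys read, "
                    + (string)gNameCount + " names found");
                return;
            }
            ++gKeyCount;
            gSent = llGetTime();
            gNameQuery = llRequestAgentData((key)data, DATA_NAME);
            llSetTimerEvent(60.0);              // 60 second timeout on the reply
        }
        else if (query_id == gNameQuery)        // the dataserver answered
        {
            llSetTimerEvent(0.0);
            ++gNameCount;
            llOwnerSay("KeyCount=" + (string)gKeyCount
                + "  NameCount=" + (string)gNameCount
                + "  name=" + data
                + "  dataserver reply time=" + (string)(llGetTime() - gSent));
            llSleep(1.0);                       // keep the rate at roughly 1 request/second
            gLineQuery = llGetNotecardLine(NOTECARD, ++gLine);
        }
    }

    timer()                                     // no reply within 60 seconds
    {
        llSetTimerEvent(0.0);
        llOwnerSay("Dataserver timed out waiting for name, line " + (string)gLine);
        gLineQuery = llGetNotecardLine(NOTECARD, ++gLine);   // skip to the next key
    }
}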

Here is a sample output from the replication script at the point where the dataserver stopped responding:

Note that the problem occurred after 2579 consecutive requests. On two other runs, the problem occurred after 3700 and 3241 requests, respectively.

[21:27] Dataserver Delay test: KeyCount=2573  NameCount=2572  name=Shila Szondi  dataserver reply time=0.866264
[21:27] Dataserver Delay test: KeyCount=2574  NameCount=2573  name=Dex Mason  dataserver reply time=0.867611
[21:27] Dataserver Delay test: KeyCount=2575  NameCount=2574  name=Ona Ra  dataserver reply time=0.866512
[21:27] Dataserver Delay test: KeyCount=2576  NameCount=2575  name=Emo Daddy  dataserver reply time=0.867553
[21:27] Dataserver Delay test: KeyCount=2577  NameCount=2576  name=Candido Ferraris  dataserver reply time=0.865922
[21:27] Dataserver Delay test: KeyCount=2578  NameCount=2577  name=hela Lennie  dataserver reply time=0.867685
[21:27] Dataserver Delay test: KeyCount=2579  NameCount=2578  name=MiChi Organiser  dataserver reply time=0.867060
[21:28] Dataserver Delay test: Datasever timed out waiting for name for 643f398f-3b57-4d19-8bd6-70224c44926c
[21:29] Dataserver Delay test: Datasever timed out waiting for name for 6447ff8e-1096-4b8c-9cef-ac352800fa69
[21:30] Dataserver Delay test: Datasever timed out waiting for name for 64482b4f-31e6-4da5-af76-deb92a2d5575
[21:31] Dataserver Delay test: Datasever timed out waiting for name for 644bc0fd-cfb3-4c2a-9526-ca7d73a7e8f4
[21:32] Dataserver Delay test: Datasever timed out waiting for name for 64595c0d-92cf-4684-8b92-7c0cd8f1b1e8
[21:33] Dataserver Delay test: Datasever timed out waiting for name for 645a7a85-e7af-40cd-a356-23d90e806a79
[21:34] Dataserver Delay test: Datasever timed out waiting for name for 645a97e8-feec-4e71-ba4e-58a2fa7caf53
[21:35] Dataserver Delay test: Datasever timed out waiting for name for 645d22b9-26f8-4d74-a24b-44af4630616b


Fred Allandale commented at 2022-03-06T20:25:03Z, updated at 2022-03-06T23:47:45Z

Ran testers in Distorted and Geum regions today. Dataserver started continuous timeouts (no response for 60 seconds) at 3218 and 3232 requests, respectively.
I noticed that sometimes it spontaneously stops timing out after 20 or more timeouts in a row.
Tester in Distorted finished with 6485 keys read, 6398 names found, 87 timeouts
Tester in Geum finished with 6485 keys read, 6274 names found, 211 timeouts

Timeouts only happen with llRequestAgentData(id,DATA_NAME).
Timeouts never happen with llRequestAgentData(id,DATA_ONLINE).
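
For clarity, the comparison can be reduced to issuing both request types for the same key and watching which one answers. This is an illustrative snippet, not the tester script:

key gNameQuery;
key gOnlineQuery;

default
{
    touch_start(integer n)
    {
        key av = llDetectedKey(0);
        gNameQuery   = llRequestAgentData(av, DATA_NAME);    // the request type that times out
        gOnlineQuery = llRequestAgentData(av, DATA_ONLINE);  // this one keeps answering
    }

    dataserver(key query_id, string data)
    {
        if (query_id == gNameQuery)        llOwnerSay("DATA_NAME: " + data);
        else if (query_id == gOnlineQuery) llOwnerSay("DATA_ONLINE: " + data);
    }
}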


Maestro Linden commented at 2022-03-15T01:03:10Z, updated at 2022-03-15T16:55:03Z

Sorry for the delay, Fred - we're investigating this issue. I have my own test script running right now with a dataset of 10k agents' names to resolve.

Update: I can confirm that I can reproduce this issue.


Fred Allandale commented at 2022-03-19T20:41:05Z

Thanks for looking into this. Any progress on a fix?


Fred Allandale commented at 2022-04-10T17:58:50Z

Reminder. This bug was accepted on March 14, 2022 but remains unassigned. Is anyone going to be working on it in the near future?


Lucia Nightfire commented at 2022-04-11T01:14:52Z

@fred, region version 2022-04-01.570305 has a fix for this.

It is currently on all RC channel regions.

Please verify if the bug still reproduces on regions running this version.


Fred Allandale commented at 2022-04-11T20:05:52Z

Confirmed this bug does NOT occur in regions running Second Life Server 2022-04-01.570305.
Also confirmed this bug DOES still occur in regions running Second Life Server 2022-03-24.569934.
So it appears to be fixed in the latest version.
Thanks!
