Issue Details

Key: SVC-1086
Type: New Feature
Status: In Progress
Priority: Major
Assignee: kelly linden
Reporter: kelly linden
Votes: 49
Watchers: 29

2. Second Life Service - SVC

LSL http_server

Created: 18/Dec/07 09:49 AM   Updated: Tuesday 02:05 PM
Component/s: XML-RPC, HTTPRequest, Scripts
Affects Version/s: None
Fix Version/s: None

Linden Lab Issue ID: DEV-6226

Description
Complete design: https://wiki.secondlife.com/wiki/LSL_http_server

Goals

Create an alternative to the XMLRPC server for communication with LSL scripts initiated from outside Second Life that is easy to use and scalable.

LSL

    * llRequestHTTPServerURL()

    An asynchronous request with no return value.
    This will create a capability on the cap server that maps to an internal simulator URL.
    An http_event will be triggered with type 'SERVER_URL' and a body containing the cap URL created.
    If a cap already exists for this object, the existing URL will be passed to the http_event.
    One server/URL per prim.
    Example public url: https://sim123.agni.lindenlab.com/cap/f23b4b94-012d-44f2-bd0c-16c328321221

    * llClearHTTPServerURL()

    This will clear or invalidate the cap for this LSL HTTP server.
    Calling llRequestHTTPServerURL() again after this should generate a new cap URL.
    Triggers an http_event with success/fail (or always with 'success' if it is not possible to fail).

    * http_server(string method, list meta, string body)

    Event triggered when a URL is hit.

        * method is GET/POST/PUT/DELETE
        * meta is a list of meta data about the request.
              o Initially this is only REQUESTING_HOST which is the IP of the request. This can be extended later as needed.
        * body is the body of the request

    * http_event(integer type, string body)

    Triggered for specific events relating to the HTTP server

        * SERVER_URL: body will be the cap URL that maps to this script's HTTP server
        * URL_LOST: no body, triggered whenever a cap is lost or cleared
              o URLs will be lost if the object changes regions or the region restarts



Comments
Gordon Wendt - 18/Dec/07 10:02 AM
Wow, never thought that this would be critical however considering the fact that the resource crunch grows by the day as more and more scripts are straining the system it definitely deserves it.

Kelly, does the fact that this is listed as critical mean that it's a fairly high priority on LL's plate to get it implemented ASAP? and if so when might we see it at least in beta form?

Lex Neva - 18/Dec/07 10:08 AM
Will HTTPS be allowed, as it is in llHTTPRequest()? If so, can we have a setting to allow only HTTPS requests? Or would we need to filter based on some metadata about whether SSL was used?

I could see this being difficult because of the need to provision many SSL certificates, one for each region or sim. Perhaps LL could solve this by creating their own CA, and posting its CA certificate on the SL website accessible by HTTPS so we can be sure that we're getting an authentic certificate.

Then again, perhaps that's all a little too complicated ;)

Drew Dwi - 18/Dec/07 10:10 AM
An issue I would like to raise:

o urls will be lost if the object changes regions or the region restarts

This will create some problems, as objects are not always aware of a region restart.

Lex Neva - 18/Dec/07 10:13 AM
That's already dealt with, it seems. The script will get an http_event() with type URL_LOST when the http server is lost. Hopefully we can get some amount of guarantee that that event will always be sent to the script when the server is lost, so that the script can restart the server if it wants to.

kelly linden - 18/Dec/07 10:14 AM
Gordon: That is my personal priority setting. We actually don't pay too much attention to the priority of issues in pjira - votes and number of comments (i.e. general interest) weigh in more. The internal one is 'major' so maybe I'll change this to match. I'm hoping to get to work on this within the next 6 months. I have other things I must do first though, priorities being what they are.

Lex: That is a good point. I actually think the cap server only listens on https, but I will look into that closer. My guess right now is it will be either HTTPS only or both.


Lex Neva - 18/Dec/07 10:18 AM
Whoa, hmm. It might be disadvantageous if ONLY HTTPS is allowed... because then people will have to dig around for an HTTPS-compliant library for their given language. Granted, many languages have them, but HTTP is likely to be more supported than HTTPS. I'm just idly nitpicking, though... I'd sure rather see HTTPS-only webservers in SL than nothing ;)

Drew Dwi - 18/Dec/07 10:20 AM
@lex:

2 objects communicating via LSL http requests (assuming this is permitted: see http://wiki.secondlife.com/wiki/LSL_http_server).

A rolling restart occurs and both objects now have new URLs. Do they have to fall back to llEmail or XML-RPC to communicate the new URLs once generated?

The point I'm trying to raise is the ability to keep track of an object's URL without having to fall back on another communication method, if that makes sense.

kelly linden - 18/Dec/07 10:31 AM
Drew: For communication between 2 objects on the SL grid as you describe, llEmail will work great as a fallback for discovering URLs. llEmail from one object on the grid to another is not really email and is much faster than normal email. The primary goal of this feature, however, is not obj -> obj comms. Email really is very good for that. The biggest goal is communication to in-world objects initiated from outside SL.

You are right, this design has the drawback that any sort of presence for objects has to be maintained by the LSL scripter. I am aware of that. I think it is a necessary evil to create the most scalable feature possible, and I have tried to add the features to make it as easy as possible. This aspect is actually mirrored by XMLRPC, which requires you to track a key that may change (although less frequently than http_server will).

Gigs Taggart - 19/Dec/07 12:46 AM
Kelly,

We need a way to get the GET variables. Thanks for this.

kelly linden - 19/Dec/07 10:16 AM
Gigs,

The wiki page has some more details I think. But basically I don't think that will be possible at first. The caps system just doesn't allow for anything past the end of the cap URL (no /<uuid>/Foo). You will be able to do POST or PUT to pass in a body with the message (or even with a GET I think?), which the LSL can look at and send back different data for. Sure it isn't the easiest, but I think it is workable.

If in the future we are able to extend the cap system to allow tacking data to the end of urls then the LSL event will be able to handle that as extra information in the META variable of the http_server event.

Lex Neva - 19/Dec/07 10:23 AM
Hmm... without being able to pass GET variables, we won't be able to easily set up a redirecting registry of LSL server URLs. I was envisioning some kind of service on a server out on the internet, which would allow LSL scripts to register their HTTP servers using an llHTTPRequest(). The script could request a simple, descriptive name, and the server could provide a static URL that wouldn't change as the script's HTTP server URL changed. Requests to that static URL would be met with an HTTP redirect that sent the requester on to the LSL script. Redirects won't send POST bodies, though, so that kind of system would break down unless the request was actually proxied. Ah well.

kelly linden - 19/Dec/07 10:27 AM
Really? It doesn't seem like much of a redirect if it won't forward the POST body. I admit to not being an expert on that though, and I can see that it would be a drawback.

The cap server is essentially an url redirect service. The data from the request at the cap url is sent to the internal url - including any data in POST or PUT (and I think even bodies of GET). Or is that more what you mean by "actually proxied"?

Lex Neva - 19/Dec/07 10:58 AM
What you describe is a proxy, rather than a redirect, I think. The caps server itself goes inside and does its own requests on behalf of the client, and then forwards the data back to the client. An HTTP redirect (HTTP 302 status, for example) actually tells the client "go here instead", and the client is responsible for following up on the request. That's advantageous for a registry like I described above because it would mean there was no burden on the server of actually fulfilling the request and forwarding the results. The original client would deal with that. That would mean the server would use fewer resources, and the registry system would act similar to a DNS lookup, but at the HTTP level.

Redirects don't forward the POST body. In fact, if your browser POSTs a form and gets back a redirect, it GETs the url specified in the "Location:" header. This is probably a good idea so that POST bodies don't get spewed all over the place. It's fairly common (especially in forums and the like) to use a POST -> Redirect -> GET system for submitting forms. The user fills out the form (ie me placing this comment), the browser POSTS the form body, the server acts on the POST, and then the server redirects the client to use a GET request to see the results of their action. This means that the user ends up, for example, just looking at a JIRA entry, and if they hit "reload", their comment won't re-POST. Generally a good thing.

From the HTTP RFC:

10.3.3 302 Found

   The requested resource resides temporarily under a different URI.
   Since the redirection might be altered on occasion, the client SHOULD
   continue to use the Request-URI for future requests. This response
   is only cacheable if indicated by a Cache-Control or Expires header
   field.

   The temporary URI SHOULD be given by the Location field in the
   response. Unless the request method was HEAD, the entity of the
   response SHOULD contain a short hypertext note with a hyperlink to
   the new URI(s).

   If the 302 status code is received in response to a request other
   than GET or HEAD, the user agent MUST NOT automatically redirect the
   request unless it can be confirmed by the user, since this might
   change the conditions under which the request was issued.

      Note: RFC 1945 and RFC 2068 specify that the client is not allowed
      to change the method on the redirected request. However, most
      existing user agent implementations treat 302 as if it were a 303
      response, performing a GET on the Location field-value regardless
      of the original request method. The status codes 303 and 307 have
      been added for servers that wish to make unambiguously clear which
      kind of reaction is expected of the client.

Interestingly enough, the common convention of automatically following a redirect resulting from a POST violates the spec... but it's generally considered a useful way to go about things, and EVERY browser does it. "303 See Other" basically does exactly what I described above, and the spec specifically says that the target URL SHOULD be retrieved with a GET.

All in all, HTTP redirects simply don't carry a POST body onward, so any data would need to be copied by the server from the original request's query string into the redirect-target's query string. So that means that something like I described above wouldn't be possible without GET argument processing. Not the end of the world, but something to consider.
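
To illustrate the point about copying the query string, here is a minimal Python sketch of what such a registry might do when it answers a request. The function names and the registry URL are hypothetical, and it assumes the cap URL would accept a query string at all, which the design discussed in this issue does not yet guarantee.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def redirect_location(target_url: str, request_url: str) -> str:
    """Return target_url with the query string of request_url appended.

    Hypothetical helper: a redirect carries no request body, so any
    parameters have to travel in the Location URL itself.
    """
    request_params = parse_qsl(urlsplit(request_url).query, keep_blank_values=True)
    target = urlsplit(target_url)
    merged = parse_qsl(target.query, keep_blank_values=True) + request_params
    return urlunsplit(target._replace(query=urlencode(merged)))


def redirect_response(target_url: str, request_url: str):
    """Build a (status, headers) pair for a 303 See Other redirect."""
    # 303 asks the client to fetch the Location with GET.
    return 303, {"Location": redirect_location(target_url, request_url)}


if __name__ == "__main__":
    cap = "https://sim123.agni.lindenlab.com/cap/f23b4b94-012d-44f2-bd0c-16c328321221"
    # registry.example is a made-up host used only for illustration.
    print(redirect_response(cap, "http://registry.example/my-object?var1=true&var2=foo"))
```

The registry never contacts the simulator itself; the client follows the Location header, which is what keeps the registry cheap compared with a proxy.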

kelly linden - 19/Dec/07 11:06 AM
Thanks Lex, that is good information.

Lex Neva - 19/Dec/07 11:08 AM
Glad I could help. I learned something, too!

Gigs Taggart - 20/Dec/07 02:39 AM
There's a deeper issue here, that the new CAPS stuff purports to be RESTful, but one of the key things about REST is that the URI can fully describe the resource to be accessed.

I'd go as far as saying that's THE core idea of REST.

Unless I'm misunderstanding this situation badly, this whole thing goes strongly against the grain of REST the way it's designed now, forcing actual resource information into the body of the request.

I'm not saying this wouldn't be extremely useful even if we don't have a GET sort of functionality, but there might be a deeper issue here that needs to be resolved.

cory bjornson - 02/Jan/08 05:32 PM - edited
Does that mean we can host a file, page and view it from a external browser?

Also, an ETA, if you would be so kind?

- Cory

kelly linden - 16/Jan/08 01:39 PM
Cory:
Not really. The url is going to be ugly and subject to changing semi-frequently. You could have a script that returned some specific data, or data about its environment or whatever, and you could view that data in a web browser. However, given the limited available script memory, general LSL constraints and the ugly, changing urls, it will not be good for use as a general web server. And no, you won't be able to host files on it (although someone will figure out the exact LSL required to serve some small file, I'm sure).

No ETA right now. This is not in active development.

SiRiS Asturias - 21/Jan/08 11:24 AM
Updated the Feature Request pages on the wiki: Added this JIRA to notes section.

http://wiki.secondlife.com/wiki/Http_server

* Feature Request function set needs to be updated to reflect any new changes/info.


Lex Neva - 21/Jan/08 12:04 PM
Siris... I looked at the functions list at the bottom of that page, and I see a function "llGetHTTPServerURL". I don't understand where it came from and why it needs a 0.5 second delay. It'd be useful to have in case you misplaced the return from http_server(), but why a delay?

SiRiS Asturias - 21/Jan/08 07:33 PM
Lex:

At the time of conception that was the set we (Sean & others...) had come up with initially. Nothing about them is set in stone & totally theoretical. The delay is just there for "shits n' giggles" basically.

Please feel free to update/delete/rename them, etc... A lot has changed since they were made originally.

Lex Neva - 22/Jan/08 03:26 PM
Well, if it's just for shits and giggles, I think I'll remove the delay, because delays are annoying. If LL needs it, they'll add it in.

cory bjornson - 23/Jan/08 07:04 PM
I need this >_>

I need to be finished <.<

Kamilion Schnook - 24/Jan/08 01:46 AM
How do we get URL parameters?
https://sim123.agni.lindenlab.com/cap/f23b4b94-012d-44f2-bd0c-16c328321221?var1=true&var2=false&var3=foo&var4=bar

What method would I use to get varX in the LSL script from a GET request to the above?

SiRiS Asturias - 24/Jan/08 03:07 AM
Using llListFindList... Hopefully the params should be received as a Name/Value pair within a strided list:

Evens - Param name
Odds - Param value

[param1, value1, param2, value2, . . . paramN, valueN]

http://wiki.secondlife.com/wiki/Http_request

kelly linden - 24/Jan/08 08:31 AM
Kamilion: For the design proposed in this jira you may need to use a PUT or a POST, in which case the variables would be in the body of the message. I'd like to be able to pass the tail end of the url in to the LSL script but there are reasons this may not be possible. If we can do it everything after the capability would be passed in as a single string in the meta list of the http_server event.

And yes, that is I think the biggest down side to the proposed design here.

Lex Neva - 24/Jan/08 08:55 AM
Hmm, if that's how it's done (passing in the query string url-escaped), then it might be nice to have a llDecodeQueryString() function. Heck, it might be nice to have one regardless, for POST bodies encoded in the same way (as happens with forms submitted with method=POST). It'd be possible to write an equivalent in LSL, but it'd be slow and use a lot of memory.
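
As a rough illustration of what a llDecodeQueryString()-style helper might return, here is a Python sketch that decodes a url-escaped query string into a strided name/value list like the one suggested earlier in the thread. Both function names are hypothetical; no such LSL function is specified in the design.

```python
from urllib.parse import parse_qsl


def decode_query_string(query: str) -> list:
    """Decode 'a=1&b=2' into a strided list ['a', '1', 'b', '2']."""
    strided = []
    for name, value in parse_qsl(query, keep_blank_values=True):
        strided.extend([name, value])
    return strided


def find_param(strided: list, name: str, default: str = "") -> str:
    """Look up a value by name, checking only the even (name) positions."""
    for i in range(0, len(strided) - 1, 2):
        if strided[i] == name:
            return strided[i + 1]
    return default


if __name__ == "__main__":
    params = decode_query_string("var1=true&var2=false&var3=foo&var4=bar")
    print(params)
    print(find_param(params, "var3"))
```

Stepping through the list two entries at a time matters: a plain search over the whole list could match a value that happens to equal a parameter name.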

kelly linden - 13/Jun/08 11:37 AM
The wiki design for this has been updated:
https://wiki.secondlife.com/wiki/LSL_http_server

There is a teaser image of a hello world server available also:
http://wiki.secondlife.com/wiki/Image:Lsl_http_server.JPG

I don't currently have a time frame other than Soon(tm).

Feedback is welcome, but please keep comments on the discussion page of the wiki; the way I am updating the design page (to reflect a changing internal design page) means edits there are likely to be lost. Final word of warning: while this is in active development, the design is being adjusted regularly to reflect actual implementation when needed.

Lex Neva - 13/Jun/08 12:27 PM
I'm excited :D


> * system_changed(integer change) (possibly a subset of existing changed event)

Maybe this should be a separate event. Having a changed() event means that scripts get events for ANY change and have to filter. That could annoyingly clog an event queue when all you want to do is keep your http server open. It could increase simulator load.

kelly linden - 13/Jun/08 01:51 PM
While that is true Lex, I think the additional script load in that case is negligible, but probably something to be considered.

The other concerns I have around the issue are:
* Events are a relatively scarce resource while trying to maintain LSL as a language. This should indicate we should use changed() if possible.
* Some (IMHO poorly written) scripts only ever expect specific changes and do not verify the changed event is for the event they expect. I've seen problems similar to this when adding a flag to the llGetAgentInfo results that caused some AO scripts trouble. For example because there were so few returned flags you rarely got more than 1 and a straight == comparison (usually) worked, when more were added that was no longer necessarily true. This is what makes it tricky and what really needs to be thought about and investigated - and may mean a new event is needed.
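
The bitfield problem described above can be shown with a short Python sketch. The constant names echo flags mentioned in this thread, but the bit values here are made up for illustration and are not the real LSL values.

```python
# Illustrative bit values only; not the actual LSL constants.
CHANGED_REGION = 0x100
SYSTEM_REGION_RESTART = 0x1000  # hypothetical flag from the proposed design


def has_flag(change: int, flag: int) -> bool:
    """Return True if 'flag' is set in the 'change' bitfield."""
    return (change & flag) != 0


if __name__ == "__main__":
    # Two changes reported in the same event.
    change = CHANGED_REGION | SYSTEM_REGION_RESTART

    # Fragile: equality only matches when exactly one flag is set.
    print(change == SYSTEM_REGION_RESTART)

    # Robust: a bitwise AND still matches when other flags are present.
    print(has_flag(change, SYSTEM_REGION_RESTART))
```

A script written with the equality test keeps working only as long as no second flag ever arrives in the same event, which is why adding new flags can break it.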

Lex Neva - 13/Jun/08 02:28 PM
Ah, it's good to hear that the load from a changed() event is low. In that case, changed() does sound like a good way of doing this. Any script using == on a bitfield deserves to be broken. It's one thing to avoid breaking content by changing the LSL API, and it's another thing to avoid breaking content by fixing LSL bugs people depend on, but it may be too far to try to avoid breaking LSL due to incorrect LSL code.

I wonder what other CHANGE_ flags the SYSTEM_CAPS_RESTART and SYSTEM_REGION_RESTART flags would be likely to occur in combination with?

Cenji Neutra - 13/Jun/08 05:44 PM
Good stuff Kelly.

Q: How are DoS attacks against sims handled?
While an attacker can't guess the public URL since it contains a random UUID part (the cap key), wouldn't they still be able to bring a sim to its knees by just flooding HTTP requests with random (non-cap) requests? I realize the requests will be rejected by the server as invalid, but it still has to respond to them all and that'll cause legitimate incoming requests to be crowded out, no?
Is there any sort of source-IP address based throttling of the incoming requests handled upstream from the sim in the network path? That might not stop attacks that use bot nets or IP spoofing, but it'll stop the naive attacker who aims to bring the sim of someone they don't like down using a quick shell script or whatever (which is likely to be the most common case in my estimation).
-Cenji.

Kamilion Schnook - 15/Jul/08 02:34 PM
Good to see that work has started on this.

Things I Like about the current design:

        * 'x-forwarded-for' and 'x-forwarded-for-port': for the ip/port of the requester.
        * 'x-trusted-path': A string specified by the user at url request
        * 'x-untrusted-argument': A header appended by the cap server containing any path and/or arguments appended to the cap

I'm glad you took my request for arguments into consideration, and I heartily approve the header name -- "x-untrusted-argument" is a perfect description of what I was looking for.
The trusted path is also good, as I'm assuming if one of my scripts obtains two public URLs, that will be the metadata I will have to check to differentiate between the two public URLs.

        * This is a first use for a more general Limited Script Resource system that should eventually also handle script memory and cpu cycles.
        * Not all requests for an url will succeed, the scripter is expected to handle the failure case.
        * The number of available urls will be based on the amount of land owned in the region
        * integer llGetFreePublicURLs() returns how many public URLs are available.

I like the fact that you guys are really thinking about the limits to put in place to make this a useful AND safe thing -- otherwise we'll end up with "the bling effect" of some jerk putting ~200 particle bling scripts in his 3 prim sculptie necklace. Very good forethought there.

    * Define response codes - no script/object found, throttled.


Yet more forethought into the response codes, I would propose that response code 410 "Gone" be used for "No Script/Object found".

410 Gone

The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the Request-URI after user approval. If the server does not know, or has no facility to determine, whether or not the condition is permanent, the status code 404 (Not Found) SHOULD be used instead. This response is cacheable unless indicated otherwise.

The 410 response is primarily intended to assist the task of web maintenance by notifying the recipient that the resource is intentionally unavailable and that the server owners desire that remote links to that resource be removed. Such an event is common for limited-time, promotional services and for resources belonging to individuals no longer working at the server's site. It is not necessary to mark all permanently unavailable resources as "gone" or to keep the mark for any length of time -- that is left to the discretion of the server owner.


301 "Moved Permanently" should be used for an object that has recently changed regions. If the regions are still in communication with each other (assuming they are adjacent), a 301 should be returned pointing to the new CAPS URL if known.


Throttled requests should result in a 503 "Service Unavailable" with a proper Retry-After header.

503 Service Unavailable

The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. If no Retry-After is given, the client SHOULD handle the response as it would for a 500 response.

      Note: The existence of the 503 status code does not imply that a
      server must use it when becoming overloaded. Some servers may wish
      to simply refuse the connection.
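
As one way to picture the throttling proposal, here is a small Python sketch of a per-source-IP fixed-window limiter that answers with 503 and a Retry-After header. The class name, the limits and the fixed-window policy are all assumptions for illustration; nothing here reflects how the cap server actually throttles.

```python
import math


class IpThrottle:
    """Hypothetical fixed-window limiter keyed by requester IP."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._windows = {}  # ip -> (window_start, request_count)

    def check(self, ip: str, now: float):
        """Return (status, headers) for a request from 'ip' at time 'now'."""
        start, count = self._windows.get(ip, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired; start a fresh one.
            start, count = now, 0
        if count >= self.max_requests:
            # Tell the client how long until the current window ends.
            wait = math.ceil(start + self.window_seconds - now)
            return 503, {"Retry-After": str(max(1, wait))}
        self._windows[ip] = (start, count + 1)
        return 200, {}


if __name__ == "__main__":
    throttle = IpThrottle(max_requests=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0):
        print(t, throttle.check("203.0.113.7", now=t))
```

Passing the clock in as an argument keeps the sketch deterministic; a real server would read the time itself and would also need to evict old entries.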


    * Define mime-handling

    I believe that we should, like the llHTTPRequest() call, be very clear in our handling of bodies and mime-types.
    In particular, accept only text/* mime types, and be sure to do proper charset handling and conversion into the