• All submissions to this site are governed by Second Life Project Contribution Agreement. By submitting patches and other information using this site, you acknowledge that you have read, understood, and agreed to those terms.
Issue Details (XML | Word | Printable)

Key: SVC-1405
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Critical Critical
Assignee: Phoenix Linden
Reporter: Andy Enfield
Votes: 5
Watchers: 4
Operations

If you were logged in you would be able to see more operations.
2. Second Life Service - SVC

llEscapeURL fails to escape non alphanumeric characters properly

Created: 01/Feb/08 08:28 AM   Updated: 02/Feb/08 08:23 PM
Return to search
Issue 3582 of 4962 issue(s)
<< Previous | SVC-1405 | Next >>
Component/s: None
Affects Version/s: 1.18.6 Server
Fix Version/s: 1.18.6 Server

Linden Lab Issue ID: DEV-9921
Linden Lab Internal Branch: Branch_1-19-0-Server


 Description  « Hide
Since the last rolling update, llEscapeURL("<string>") fails to properly escape the result.

Example:

llSay(0, llEscapeURL("Hello & goodbye"));

– should produce "Hello%20%26%20goodbye"

– Actually produces "Hello%20&%20goodbye"

Further testing seems to suggest that llEscapeURL() is escaping spaces only.

This potentially breaks a vast array of scripts that use GET to send data out of SL using llHTTPRequest()



 All   Comments   Change History      Sort Order: Ascending order - Click to sort in descending order
Lex Neva added a comment - 01/Feb/08 10:39 AM
Augh, that's bad! Is this a new thing? Or has it always been like this?

Phoenix Linden added a comment - 01/Feb/08 11:16 AM
The set of characters to not escape was chosen to match HTTP urls and general guidelines for naming resources. Passing in a raw url does not produce well defined results because you really need to know which segments are path parts because path parts are supposed to be escaped individually. The set chosen:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
0123456789
-._~
:@!$'()*+,=/?&#;

This API is basically broken because it does not allow you to specify significant path characters. For example, if the filename actually contained a /, then you cannot use this function to generate the serialized url for that resource.

The only way to actually implement this correctly would be to provide an API like:

generate_url(protocol, authority, host, port, path, query)

where path is a list of path parts and query is a container which supports association.

For example, if you are generating an url with a query string, you probably do not expect to escape to:

http://path/to/resourse%3fname%3Dvalue%26q%3Dtrue


Phoenix Linden added a comment - 01/Feb/08 11:44 AM
We can change this to a different set of characters, eg, escape everything other than alpha-numeric.

I see that curl uses this scheme – I can roll in that change in quickly.


Andy Enfield added a comment - 01/Feb/08 01:09 PM
Lex, it's a new issue -> was quite nicely sending llEscape-d strings offworld via httprequest the day before the rolling update; now it's failing.

Phoenix, that set of characters escaped beautifully before, so it's really a new issue, the API broke this week.


Andy Enfield added a comment - 01/Feb/08 01:13 PM
And a quick fix would be much appreciated — many thanks if you can.

Basically, as of mid last week (the last time I checked), it was perfectly possible to do something like ...

string url = "http://www.mywebsite.com/feedback.php?NAME=" + llEscapeURL(name) ...

and if name was set to e.g. "black & white", then it would escape and pass out perfectly.


Phoenix Linden added a comment - 01/Feb/08 01:18 PM
Code already fixed internally. It's in QA. I will resolve this when it passes QA. We'll ship this ASAP.

Phoenix Linden added a comment - 01/Feb/08 01:34 PM
The wiki document and expected behavior have been updated to be more exact to reflect the version in testing.

Lex Neva added a comment - 01/Feb/08 01:44 PM
So just to be clear here... something changed that wasn't supposed to change, and you've changed it back, right Phoenix?

As to your first comment, I do understand the predicament. It makes sense that you chose not to escape characters that are legal in URLs... but I think you see why we're in a pretty tough place. I, for example, use llEscapeURL() to escape things that I'm passing as query string argument values... ie

string url = "http://www.myserver.com/something.php?argname=" + llEscapeURL(somestring);

I expect that to work akin to a mysql_escape_string type thing, where I'm making data "safe". If ampersands make it through llEscapeURL() unscathed, then I need to implement my own url-escaping function... and in LSL, since we don't have a text search-and-replace function, much less regular expressions, this is a severe pain in the ass.

One more argument for escaping everything non-alphanumeric: It's easy to work around the case that you didn't want things like &=# etc to be escaped. Just don't pass the entire URL to llEscapeURL() (who would do that anyway?); instead build the URL part-wise like I showed above, passing each path segment in and escaping it individually. On the other hand, it's hard to work around a llEscapeURL() that doesn't escape &=#, because you'd essentially have to write your own escaping function. Bleh.

Sorry if this is a moot point.


Lex Neva added a comment - 01/Feb/08 01:46 PM
More clarification needed: I see you removed the notes on the wiki about the 255 character limit that llEscapeURL() seems to have. Is that because that limit was fixed?

enus linden added a comment - 01/Feb/08 05:27 PM - edited
fix is actually in 1.19.0 server, but that value isn't available in the Fix version menu yet.

is working again, hooray for URL escaped strings!


Keiko Rau added a comment - 01/Feb/08 06:31 PM - edited
Ive just spent the last 7 hours debugging this. Like Lex (I think it was - cant see previous comments right now) I use llEscapeURL to clean up POST data that Im sending to a web site. This is definately NOT fixed in 1.19.0 server.so ive reopened the issue. My problems began with the change to 1.19.0

Right now Im testing in Woodbridge, which happens to run 1.18.6 with the havoc beta, and is right beside a 1.19.0 server. Jumping across the border, scripts that work in 1.18.6 (and have worked fine for months now) fail in 1.19.0.

Specifically, 1.19.0 does not encode these characters (and perhaps others)
!#$&'()*+,-./:;=?@_~=

Whereas 1.18.6 and earlier servers have been escaping these for months as so
%21%23%24%26%27%28%29%2A%2B%2C%2D%2E%2F%3A%3B%3D%3F%40%5F%7E%3D

This issue is not clear as to how this has been fixed. Please clarify how it has been fixed, and when the fix will be rolled out to the grid, because im testing in 1.19.0 right now, and its still very broken.

Edit: Appologies if being resolved meant that its in QA, but I saw Enus's comment that its in 1.19.0 before I noticed Phoenix's earlier one. Also to clarify what im doing, Im using the base64 functions to encrypt some of my post data, so I need to escape the = found in the base64 strings, or it confuses things. Im also only encoding the data parts of the string, so & is passed through without escaping, as is the = in the key=value pair, but I need to escape the = in the base64 data.


Dedric Mauriac added a comment - 01/Feb/08 06:37 PM - edited
Please revert the changes in this "fixed" version back ASAP. You've broken a ton of scripts. If you are going to "fix" this for full URL paths, then please create a new method such as llEscapeURLPath(string url). There is no "horray for URL escaped strings" as these strings are no longer fully escaped.

I've always used this to escape field values in my forms for HTTP Post Data.

llEscapeURL(fieldName) + "=" + llEscapeURL(fieldValue);

Please make this work for ! # $ & ' ( ) * + , - . / : ; = ? @ _ ~


Strife Onizuka added a comment - 01/Feb/08 09:23 PM
The documentation for this code has been for a long time that it escapes all characters except alpha-numerical ones. Changing this behavior isn't a good idea. It would be much better to introduce a new function.

Keiko Rau added a comment - 01/Feb/08 09:58 PM
Thanks Guys.
Ive tested this and can confirm after the latest rolling restart that 1.19.0 is now escaping these characters as the previous servers did.

Phoenix Linden added a comment - 02/Feb/08 08:07 PM - edited
I admit this was a failure on my part to understand the function. I changed the implementation to only be bound by script memory space instead of 255 bytes. The old implementation was based on a function in another library that did not document it's behavior. I checked the wiki and it also was unclear on expected input and output – something along the lines of "spaces turn into %20, etc." I have since corrected the documentation to be more precise.

QA has been adding lsl conformance tests to catch this sort of thing before a release happens.

http://wiki.secondlife.com/wiki/LSL_Test_Harness
http://wiki.secondlife.com/wiki/Category:Conformance_Test


Lex Neva added a comment - 02/Feb/08 08:23 PM
Perhaps the somewhat ambiguous name of the function added to your confusion. I'm really glad to hear about conformance tests... that way the onus is not so much on LL developers to know every little tidbit about LSL's functionality to avoid regressions.