Forgot to say, this is in Linux, using OpenJPEG.
This has been an issue with OpenJPEG since the very beginning! You only see it if you encode with OpenJPEG and then decode with OpenJPEG. You don't see it if you decode with KDU.
It looks like someone switched whatever it is on the backend that encodes the map images to using OpenJPEG! With broken encoder settings. Please see Here's a screenshot. Notice the distinct red/blue color fringing on the low-res layer. This may be a bug. And it really should be enabling the MCT to convert the channels to YUV before encoding.
I tried using the patch for
Okay, so here's what I know:
The bug seems to disappear and reappear for reasons I've been trying to nail down for months, and I have gotten nowhere. The bug manifests itself differently on different architectures. On x86_64, avatar texture baking always seems to fail completely, resulting in your avatar turning grey. Every time, without fail. "Grey avatar" never happens on i386. It seems to be related to the texture size. Smaller textures, say 32x32 or 64x64 (not sure which), seem not to be affected. Possibly related is the fact that you only get the yellow/blue problem once the final, full-resolution image is decoded, never before. This is definitely OpenJPEG related, as was noted on the mailing list.
I have seen this when uploading sculpt textures, where three quarters of the texture is red and the image sits in the corner quadrant. It may also be related to the red/green baking artifact in OpenJPEG. As I've suspected all along, the entire SL infrastructure isn't prepared to handle lossless textures. This may be related to
Okay, I've determined for certain what the source of this bug is:
https://lists.secondlife.com/pipermail/sldev/2007-December/007106.html
Basically, slviewer tries to download partial j2k textures by making assumptions about the compression ratio. This breaks down with lossless textures: the viewer does not download enough of the texture for the resolution level it is interested in, then hands it off to OpenJPEG, which of course barfs on it. Note that this is in common code, as
Unfortunately, a full fix for this does not appear to be trivial. Patching the client to assume a 0.5 compression ratio gets rid of the visible corruption, but has strange side effects. Presumably this causes the client to download way more of the lossy textures than it needs. There's no obvious way to determine in advance whether you're dealing with a lossless texture and adjust the ratio assumption, as this would require parsing the j2k header, which is exactly what the current implementation seems to be going out of its way to avoid doing. The current method of assuming the compression ratio is total crackrock.
As far as I can see, the true fix is to use j2k the way it was designed: download the j2k header, parse it, and download the data it actually needs, rather than making wild hardwired guesses. I can probably figure out how to do this with OpenJPEG, but it will probably break KDU and require someone with access to the KDU source to fix it up...
Sounds like a plan, go for it Seg.
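To make the "download the j2k header, parse it" idea a bit more concrete, here is a minimal sketch in Python (not the viewer's C++ and not OpenJPEG's API) that reads the image size and component count from the SIZ marker segment at the start of a raw JPEG 2000 codestream. The names `SizInfo` and `parse_siz` are made up for illustration. It assumes a bare codestream (no JP2 file wrapper) and does not read the COD segment, which is where the number of decomposition levels lives.

```python
import struct
from typing import NamedTuple

SOC = 0xFF4F  # start-of-codestream marker
SIZ = 0xFF51  # image and tile size marker, always first after SOC


class SizInfo(NamedTuple):
    """Hypothetical container for the few SIZ fields used here."""
    width: int
    height: int
    components: int


def parse_siz(codestream: bytes) -> SizInfo:
    """Read image size and component count from a raw J2K codestream."""
    # SOC(2) + SIZ(2) + Lsiz(2) + Rsiz(2) + eight 32-bit fields + Csiz(2) = 42 bytes
    if len(codestream) < 42:
        raise ValueError("need at least 42 bytes to read the SIZ segment")
    soc, siz, _lsiz, _rsiz = struct.unpack_from(">HHHH", codestream, 0)
    if soc != SOC or siz != SIZ:
        raise ValueError("not a raw JPEG 2000 codestream")
    # Xsiz, Ysiz are the reference grid extent; XOsiz, YOsiz the image offset.
    xsiz, ysiz, xosiz, yosiz = struct.unpack_from(">IIII", codestream, 8)
    (csiz,) = struct.unpack_from(">H", codestream, 40)
    return SizInfo(xsiz - xosiz, ysiz - yosiz, csiz)
```

With the real dimensions in hand, a fetcher would no longer have to guess them, though deciding how many bytes each resolution level needs would still require walking further into the codestream than this sketch does.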
The "top corner jobs" are a bug in OpenJPEG and only show if a full resolution is requested (opj_dparameters_t.cp_reduce set to 0).
As already stated, it is really a side effect of trying to decode a truncated codestream that does not have the necessary data in it, and you will never see it if a complete full image is used. The reason that it only shows for some images seems to be connected to the order of data packets in the codestream. If a full-resolution decode is requested, OpenJPEG will only decode the highest resolution level it has encountered so far. If a codestream puts higher-resolution data at the end, it's possible that due to truncation that part is never encountered, so the internal maximum resolution gets stuck at the previous level. So even if a full-resolution image is requested, it will be decoded at half the resolution, leaving the rest of the allocated image buffer filled with garbage. A preliminary fix for OpenJPEG is pretty simple, and of course I'll try to get it upstream to the OpenJPEG crowd. I'll attach it if anyone wants to test it (you need to rebuild OpenJPEG for this, unfortunately). Anyway, this is really just a side effect, and the fix only leads to better error resilience in OpenJPEG.
You are a god. Sounds like this should bring OpenJPEG in line with KDU's behaviour, as seen on
Haven't tried the patch yet though. I'll pound on it with my test suite momentarily...
Attached is my workaround for this issue. Carjay has added some defense into OpenJPEG and I've added some defense into the texture fetcher. Some of my patch would still be needed to obtain full-resolution images, due to an SVR bug that is also related to this mess, even if you use a patched OpenJPEG.
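The truncation effect described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not OpenJPEG's actual logic: it pretends the codestream stores its data strictly in resolution order, coarsest first, and that a level is usable only once all of its bytes have arrived. The function names and the byte counts in the usage note are invented.

```python
def highest_complete_level(level_sizes, bytes_available):
    """Index of the highest resolution level fully contained in the data.

    level_sizes[i] is the (assumed) number of bytes belonging to level i,
    with level 0 the coarsest. Returns -1 if not even level 0 is complete.
    """
    consumed = 0
    highest = -1
    for level, size in enumerate(level_sizes):
        consumed += size
        if consumed > bytes_available:
            break  # this level is cut off by the truncation
        highest = level
    return highest


def decoded_size(full_w, full_h, num_levels, highest_level):
    """Image size obtained when only levels 0..highest_level are decoded."""
    if not 0 <= highest_level < num_levels:
        raise ValueError("highest_level out of range")
    scale = 1 << ((num_levels - 1) - highest_level)  # halves per missing level
    # Ceiling division, assuming the image origin is at (0, 0).
    return (-(-full_w // scale), -(-full_h // scale))
```

For example, with three levels of 1000, 3000 and 12000 bytes and only 5000 bytes fetched, `highest_complete_level` returns 1, and `decoded_size(512, 512, 3, 1)` gives `(256, 256)`: a quarter of the 512x512 buffer holds image data and the rest is never written, which matches the "top corner" symptom.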
It appears that the server is sent a discard level as part of the image request, and it also appears to use this to judge how much data to send. After this data is sent, the packets only update on a 10-15 s timer, which leads to maps etc. taking 20 minutes to download. My patch does the following:
1) When downloading textures, if we have reached discard level 0 (full res) but there is data remaining, the fetcher is kept running and the image will not be passed to decode until it's all present.
2) When fetching from the cache, if we are on discard level 0, the image is kicked into the SIM/NET state machine. Due to the fundamental underlying issue, the amount of data read from the cache will be wrong, so the extra data needs to be downloaded again, sigh (or another patch could rewrite half the cache to solve this).
3) If either 1 or 2 occurs, a flag is set that ensures the remaining texture is downloaded at a more reasonable rate.
Results: well, I can actually browse the map now, no top corner maps for me, and they don't take 20 minutes. Some do seem to correctly load back from cache; others are a level down and it takes 30 seconds to catch up again.
Good luck!
All good progress...
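The three steps of the workaround can be sketched roughly as a single decision function. This is a simplified Python model, not the actual patch: `FetchState`, `next_action`, the field names and the returned action strings are all hypothetical, and it assumes the fetcher somehow knows the texture's total size.

```python
from dataclasses import dataclass


@dataclass
class FetchState:
    """Hypothetical snapshot of one texture fetch."""
    discard_level: int            # 0 means full resolution
    bytes_received: int
    total_bytes: int              # assumed to be known to the fetcher
    from_cache: bool = False
    fast_remaining: bool = False  # step 3: fetch the rest at a sane rate


def next_action(state: FetchState) -> str:
    """Decide what to do with the data gathered so far."""
    if state.discard_level != 0:
        return "decode"  # lower resolutions follow the existing path
    if state.from_cache:
        # Step 2: the cached byte count cannot be trusted at full res,
        # so hand the request back to the network state machine.
        state.fast_remaining = True
        return "refetch_from_network"
    if state.bytes_received < state.total_bytes:
        # Step 1: do not decode a full-res image until it is all present.
        state.fast_remaining = True
        return "keep_fetching"
    return "decode"
```

The point of folding step 3 into the same function is that the flag is set exactly when steps 1 or 2 trigger, so the remaining data does not fall back to the slow 10-15 s update timer.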
Looking at the 'hackery' patch, I'm a bit fuzzy on how losslessness is relevant (is lossless being used to describe a single-resolution stream?). The variable name lossless may be misleading.
Basically it's valid for any stream where the expected cutoff was not based on the w*h*c*rate calculation, where rate=0.125. This is often a lossless texture, as the stream is much longer than expected. Looking at the comments on Carjay's patch on the OpenJPEG forum, I would guess that within the viewer we should ALSO be looking for j2k/jpc termination markers in processsimulatorpackets() and using that to judge when to hand off to decode. We would have to do something similar in the cache recall too, but that's a little nasty at present without quite a few changes to the cache code.
I would name the variable mDesiredSizeGuessWasAnEpicFailure, but that's just me.
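For reference, the w*h*c*rate guess and the marker check discussed above look roughly like this in Python. It is a sketch, not viewer code: the function names are invented, and the only marker checked is EOC (0xFFD9), which ends the whole codestream. It therefore only tells you that the full stream has arrived, not where an individual discard level ends.

```python
EOC = b"\xff\xd9"     # JPEG 2000 end-of-codestream marker
ASSUMED_RATE = 0.125  # the hardwired ratio mentioned in the thread


def estimated_bytes(width, height, components, rate=ASSUMED_RATE):
    """Size guess of the form w * h * c * rate."""
    return int(width * height * components * rate)


def guess_was_too_small(width, height, components, actual_size,
                        rate=ASSUMED_RATE):
    """True when the real stream is longer than the guess allows for."""
    return actual_size > estimated_bytes(width, height, components, rate)


def ends_with_eoc(data: bytes) -> bool:
    """True when the received bytes end with the EOC marker."""
    return data.endswith(EOC)
```

For a 256x256 three-component texture the guess is 24576 bytes, so a lossless stream of, say, 120000 bytes would be cut off long before its final resolution level, and `ends_with_eoc` on the truncated data would return False.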
Qarl just posted a patch over on Qual's patch may in fact be complementary to mine; it looks like it's fixing the texture fetch from the cache, in which case #2 of my patch should be replaced with it. I need to see the patch in its applied context to really understand it, which I can't do right now.
I think Qual had an (inline in comments) patch on 2404 that did what my #1 does, but this has been lost in the melee and never applied. But this part is still affected by the SVR bug previously described, and solving this involves the server parsing the j2k file before sending, so that it knows where the termination markers are for each discard level instead of making wild assumptions.
Actually, your suggested variable name explains the variable's purpose a whole lot better than my name, which is good coding practice. The fact that KDU does not show top corners (and now Carjay's patch brings OpenJPEG into line) is no excuse for the viewer to feed either library duff data; correctly parsing and detecting the discard markers is the right way to solve this on both the client and the server side.
I tested both Michelle's and Qarl's patches now and like the results.
One more patch from me: since the OpenJPEG maintainers seem not to be too positive about the approach the SL viewer is taking, I went and fixed the check that was already present in the viewer and was supposed to detect whether a decode went wrong or did not receive enough data. The check as currently implemented can never work, since the "factor" field it uses is actually the original factor as set by the user (so it contains what you requested, not what was actually decoded). I have verified that the check now actually works (no top corner jobs even with OpenJPEG unpatched), but after applying Qarl's and Michelle's patches it does not really seem to be necessary. I hardly ever saw a decode fail. The patches from Qarl and Michelle really seem to be complementary, since I could leave out either one and the decode failures did not show up even when opening the world viewer (which used to be the easiest way to trigger this condition).
Oh, if you apply my patch, be aware that unfortunately the current OpenJPEG library contains a small memory leak that will show if you use it: the library doesn't completely clean up the additionally requested codestream information, which is needed to check the number of decomposition levels that were decoded. This leaks about 20 bytes for each decoded image. I pushed the patch to fix this upstream and hope it gets applied soon. Memleaks through defensive coding aren't really what I have in mind usually. The attached patch also contains the one from
Right, found an appropriate SVR issue for this, titled "World Map download unnecessarily slow....". Will put comments on the SVR bug there. Linked as related to
I have not seen this issue for a while now, ever since Qarl's patches hit the release branch of the viewer.
There are related issues due to texture download speed, but this specific issue appears to have been resolved for a while. So I'm declaring it closed, since no one has commented in a long time; feel free to reopen if you can reproduce it on any new viewers.
This one is in a custom viewer based on 1.18.0.6