
Key: VWR-1815
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Normal
Assignee: Unassigned
Reporter: Whoops Babii
Votes: 6
Watchers: 4
1. Second Life Viewer - VWR

OpenJPEG sometimes displaying image in upper left quadrant only.

Created: 19/Jul/07 07:47 AM   Updated: 19/Dec/08 04:34 AM
Component/s: Graphics, Source Code
Affects Version/s: 1.18.0
Fix Version/s: 1.20

File Attachments:
1. openjpeg_check_decoded_levels.diff (4 kB)
2. openjpeg_top_corner_fix.diff (0.5 kB)
3. top_corner_hackery.patch (2 kB)

Image Attachments:
1. bad_map.jpg (604 kB)
2. compressed_maps.png (1.05 MB)
3. Snapshot_001.png (626 kB)
Environment:
Solaris, Ultra-40M2, (2) AMD Opteron dual core CPUs, 8GB memory
Linux OpenJPEG
Issue Links:
Relates
 

Last Triaged: 20/Nov/08 01:52 PM
Source Version: 1.18.0.4
Linden Lab Issue ID: DEV-24158
Patch attached: Patch attached


Description
Each sim map is being compressed to the upper left quadrant of the in-world map. The remaining 3 quadrants are left grey.

This also manifests with texture uploads at low resolutions such as 32x32.



Dale Glass added a comment - 19/Jul/07 08:05 AM
Added a screenshot of the effect

This one is in a custom viewer based on 1.18.0.6


Dale Glass added a comment - 19/Jul/07 08:07 AM
Forgot to say, this is in Linux, using OpenJPEG.

Lex Neva added a comment - 19/Jul/07 09:13 AM
Funky. Does this happen in any of LL's official builds, including first look? If not, this issue may not belong here.

Seg Baphomet added a comment - 19/Jul/07 11:50 AM
This has been an issue with OpenJPEG since the very beginning! You only see it if you encode with OpenJPEG and then decode with OpenJPEG. You don't see it if you decode with KDU.

It looks like someone switched whatever it is on the backend that encodes the map images to using OpenJPEG! With broken encoder settings. Please see VWR-1475


Seg Baphomet added a comment - 19/Jul/07 11:56 AM
Here's a screenshot. Notice the distinct red/blue color fringing on the low res layer. This may be a bug. And it really should be enabling the MCT to convert the channels to YUV before encoding.

Whoops Babii added a comment - 19/Jul/07 12:49 PM
I tried using the patch for VWR-1475 and the maps still looked compressed, see the compressed_maps.png screenshot. I tried the test setting OpenJPEGEncodeLossless set to both true and false with no change.

Seg Baphomet added a comment - 19/Jul/07 01:01 PM
VWR-1475 only affects image encoding, so it is not going to fix already encoded images; the images have to be re-encoded.

Seg Baphomet added a comment - 19/Jul/07 01:48 PM
Okay, so here's what I know:

The bug seems to disappear and re-appear for reasons I've been trying to nail down for months and have got nowhere.

The bug manifests itself differently on different architectures. On x86_64, avatar texture baking always seems to fail completely, resulting in your avatar turning grey. Every time, without fail. "Grey avatar" never happens on i386.

It seems to be related to the texture size. Smaller textures, say 32x32 or 64x64, not sure which, seem to not be affected. Possibly related is the fact that you only get the yellow/blue problem once the final, full resolution image is decoded, never before.


Gigs Taggart added a comment - 19/Jul/07 02:20 PM
This is definitely OpenJPEG related as was noted on the mailing list.

I have seen this when uploading sculpt textures, where 3/4 of the image is red and the texture is in the corner quadrant. It may also be related to the red/green baking artifact in OpenJPEG.


Seg Baphomet added a comment - 10/Oct/07 09:28 PM
As I've suspected all along, the entire SL infrastructure isn't prepared to handle lossless textures. This may be related to VWR-2404.

Seg Baphomet added a comment - 29/Dec/07 02:22 PM
Okay, I've determined for certain what the source of this bug is:

https://lists.secondlife.com/pipermail/sldev/2007-December/007106.html

Basically slviewer tries to download partial j2k textures by making assumptions about the compression ratio. This breaks down with lossless textures: the viewer does not download enough of the texture for the resolution level it is interested in, then hands it off to OpenJPEG, which of course barfs on it.

Note that this is in common code, as VWR-2404 shows this affects KDU as well. I'm guessing KDU is better at handling corrupt images thus this bug is not as visible with KDU, but it is still there.

Unfortunately a full fix for this does not appear to be trivial. Patching the client to assume a .5 compression ratio gets rid of the visible corruption, but has strange side effects. Presumably this causes the client to download way more of the lossy textures than it needs. Unfortunately there's no obvious way to determine in advance whether you're dealing with a lossless texture and adjust the ratio assumption, as this would require parsing the j2k header, which is exactly what the current implementation seems to be going out of its way to avoid doing.

The current method of assuming the compression ratio is total crackrock. As far as I can see, the true fix is to use j2k the way it was designed. Download the j2k header, parse it, and download the data it actually needs, rather than making wild hardwired guesses. I can probably figure out how to do this with OpenJPEG but it will probably break KDU and require someone with access to the KDU source to fix it up...
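As a rough illustration of what "download the j2k header, parse it" involves, here is a minimal Python sketch that reads the image geometry from the SIZ segment at the start of a raw JPEG 2000 codestream. The function name `parse_siz` is made up for this example and is not viewer or OpenJPEG code. It only recovers width, height and component count; working out how many bytes each resolution level needs would require parsing further markers and packet headers, which this sketch does not attempt.

```python
import struct

SOC = 0xFF4F  # start-of-codestream marker
SIZ = 0xFF51  # image and tile size marker, directly after SOC


def parse_siz(data: bytes) -> dict:
    """Read image geometry from the first 42 bytes of a raw J2K codestream."""
    if len(data) < 42:
        raise ValueError("need at least 42 bytes to read the SIZ segment")
    soc, siz = struct.unpack(">HH", data[:4])
    if soc != SOC or siz != SIZ:
        raise ValueError("not a raw JPEG 2000 codestream")
    # Field order: Lsiz, Rsiz, Xsiz, Ysiz, XOsiz, YOsiz,
    #              XTsiz, YTsiz, XTOsiz, YTOsiz, Csiz
    (_lsiz, _rsiz, xsiz, ysiz, xosiz, yosiz,
     _xtsiz, _ytsiz, _xtosiz, _ytosiz, csiz) = struct.unpack(
        ">HHIIIIIIIIH", data[4:42])
    # The image area is the reference grid size minus the image offset.
    return {"width": xsiz - xosiz, "height": ysiz - yosiz, "components": csiz}
```

With the real dimensions and component count in hand, a fetcher would at least not have to guess those; the per-level byte ranges are the harder part.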


Strife Onizuka added a comment - 29/Dec/07 11:09 PM
sounds like a plan, go for it Seg.

Carjay McGinnis added a comment - 12/May/08 08:16 AM
The "top corner jobs" are a bug in OpenJPEG and only show if a full resolution is requested (opj_dparameters_t.cp_reduce set to 0).

As already stated, it is really a side effect of trying to decode a truncated codestream that does not have the necessary data in it, and you will never see it if a complete full image is used.

The reason that it only shows for some images seems to be connected to the order of data packets in the codestream. If a full resolution decode is requested OpenJPEG will only decode the highest resolution level it has encountered so far.

If a codestream puts higher resolution data at the end of its codestream it's possible that due to truncation that part is never encountered so the internal maximum resolution will get stuck at the previous level. So even if a full resolution image is requested it will be decoded at half the resolution, leaving the rest of the allocated image buffer filled with garbage.
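The effect described above can be mimicked with a toy model. This is a sketch only, not OpenJPEG code: `decode_into_buffer` is a hypothetical helper that allocates a full-size square buffer but fills only the part a reduced-resolution decode would write, assuming each missing resolution level halves width and height.

```python
def decode_into_buffer(full_size: int, levels_missing: int,
                       fill: str = ".", pixel: str = "#") -> list:
    """Return the rows of a full_size x full_size buffer after a reduced decode.

    levels_missing is how many of the highest resolution levels were cut
    off by truncation; each one halves the decoded width and height.
    """
    decoded = full_size >> levels_missing
    rows = []
    for y in range(full_size):
        # Only the top-left decoded x decoded area receives real pixels.
        row = "".join(pixel if (x < decoded and y < decoded) else fill
                      for x in range(full_size))
        rows.append(row)
    return rows


for row in decode_into_buffer(8, 1):
    print(row)
```

With one level missing, the first four rows print as `####....` and the last four as `........`: the image sits in the upper left quadrant and the rest of the buffer is whatever was there before.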

A preliminary fix for OpenJPEG is pretty simple, and of course I'll try to get it accepted upstream by the OpenJPEG crowd. I'll attach it if anyone wants to test it (you need to rebuild OpenJPEG for this unfortunately).

Anyway, this is really just a side effect and only leads to better error resilience in OpenJPEG.


Seg Baphomet added a comment - 12/May/08 11:46 AM
You are a god. Sounds like this should bring OpenJPEG in line with KDU's behaviour, as seen on VWR-2404.

Haven't tried the patch yet though. I'll pound on it with my test suite momentarily...


Michelle2 Zenovka added a comment - 12/May/08 04:10 PM
Attached is my workaround for this issue. Carjay has added some defense into OpenJPEG and I've added some defense into the texture fetcher. Even with a patched OpenJPEG, some of my patch would still be needed to obtain full resolution images, due to an SVR bug that is also related to this mess.

It appears that the server is sent a discard level as part of the image request, and that it uses this to judge how much data to send. After this data is sent, the packets only update on a 10-15s timer, which leads to maps etc. taking 20 minutes to download.

My patch does the following:

1) When downloading textures, if we have reached discard level 0 (full res) but there is data remaining, the fetcher is kept running and the image will not be passed to decode until it's all present.

2) When fetching from the cache, if we are on discard level 0 the image is kicked into the SIM/NET state machine. Due to the fundamental underlying issue, the amount of data read from the cache will be wrong, so the extra data needs to be downloaded again, sigh (or another patch could rewrite half the cache to solve this).

3) If either 1 or 2 occur, a flag is set that ensures the remaining texture is downloaded at a more reasonable rate.
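The three steps above can be sketched as a small decision function. This is an illustrative Python model, not the actual patch: `FetchState`, `next_action`, the `fast_fetch` flag and the returned action strings are all hypothetical names, and it assumes the total stream size is known to the fetcher.

```python
from dataclasses import dataclass


@dataclass
class FetchState:
    discard_level: int        # 0 means full resolution
    bytes_received: int
    total_bytes: int          # assumed: full stream size is known
    from_cache: bool = False
    fast_fetch: bool = False  # hypothetical name for the flag in step 3


def next_action(state: FetchState) -> str:
    """Decide what the fetcher does with the data it currently holds."""
    if state.discard_level > 0:
        return "decode"          # partial levels are handled as before
    if state.from_cache:
        state.fast_fetch = True  # step 3: fetch the remainder quickly
        return "refetch"         # step 2: go back to the SIM/NET path
    if state.bytes_received < state.total_bytes:
        state.fast_fetch = True  # step 3
        return "keep_fetching"   # step 1: hold off decoding
    return "decode"
```

For example, `next_action(FetchState(0, 100, 400))` returns `"keep_fetching"` and sets `fast_fetch`, while the same state with all 400 bytes present returns `"decode"`.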

Results: well, I can actually browse the map now, no top corner maps for me and they don't take 20 minutes. Some do seem to load back correctly from cache; others are a level down and take 30 seconds to catch up again.

Good luck!


Tofu Linden added a comment - 13/May/08 02:47 AM
All good progress...

Looking at the 'hackery' patch, I'm a bit fuzzy on how losslessness is relevant (is lossless being used to describe a single-resolution stream?).


Michelle2 Zenovka added a comment - 13/May/08 03:36 AM
The variable name lossless may be misleading.

Basically it's valid for any stream where the expected cut-off does not match the w*h*c*rate calculation (with rate=0.125). This is often a lossless texture, as the stream is much longer than expected.
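To make the arithmetic concrete, here is a small Python sketch of the w*h*c*rate estimate. The function name `estimated_bytes` is hypothetical, the halving of width and height per discard level is an assumption of this sketch, and the "actual" lossless size is a made-up number for illustration only.

```python
def estimated_bytes(width: int, height: int, components: int,
                    discard_level: int = 0, rate: float = 0.125) -> int:
    """Guess how many bytes of a J2K stream cover the given discard level."""
    # Assumption: each discard level halves width and height.
    w = max(width >> discard_level, 1)
    h = max(height >> discard_level, 1)
    return int(w * h * components * rate)


guess = estimated_bytes(256, 256, 3)               # 24576 bytes at rate 0.125
generous = estimated_bytes(256, 256, 3, rate=0.5)  # 98304 bytes at rate 0.5
actual_lossless = 90_000  # made-up lossless stream size, illustration only

print(guess < actual_lossless)      # True: the fetch stops far too early
print(generous >= actual_lossless)  # True: enough here, but lossy textures get over-fetched
```

A 256x256 RGB texture is estimated at 24576 bytes for full resolution; any stream noticeably longer than that gets cut short before the highest resolution level has arrived.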

Looking at the comments to Carjay's patch on the OpenJPEG forum, I would guess that within the viewer we should ALSO be looking for j2k/jpc termination markers in processsimulatorpackets() and using that to judge when to hand off to decode. We would have to do something similar in the cache recall too, but that's a little nasty at present without quite a few changes to the cache code.
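As a minimal sketch of the marker idea: a complete JPEG 2000 codestream ends with the two-byte EOC marker 0xFFD9, so the simplest possible test is whether the received data ends with it. The helper name `stream_is_complete` is hypothetical, and this only detects the end of the whole stream; finding the boundary of each discard level would need real packet parsing.

```python
EOC = b"\xff\xd9"  # JPEG 2000 end-of-codestream marker


def stream_is_complete(data: bytes) -> bool:
    """True if the buffer ends with the end-of-codestream marker."""
    return len(data) >= 2 and data[-2:] == EOC
```

A fetcher could hold a full-resolution image back from the decoder until this returns true, instead of trusting a size estimate.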


Seg Baphomet added a comment - 13/May/08 11:08 PM
I would name the variable mDesiredSizeGuessWasAnEpicFailure but that's just me.

Qarl just posted a patch over on VWR-2404. Is it relevant at all? Or is it more dead chicken waving? I suppose I should just try it myself...


Michelle2 Zenovka added a comment - 14/May/08 01:06 AM
Qarl's patch may in fact be complementary to mine; it looks like it fixes the texture fetch from the cache, in which case #2 of my patch should be replaced with it. I need to see the patch in its applied context to really understand it, which I can't do right now.

I think Qarl had an (inline in comments) patch on 2404 that did what my #1 does, but it was lost in the melee and never applied. This part is still affected by the SVR bug previously described, and solving that involves the server parsing the j2k file before sending, so that it knows where the termination markers are for each discard level instead of making wild assumptions.

Actually your suggested variable name explains the variable's purpose a whole lot better than my name, which is good coding practice.

The fact that KDU does not show top corners (and that Carjay's patch now brings OpenJPEG into line) is no excuse for the viewer to feed either library duff data. Correct parsing and detection of the discard markers is the right way to solve this, on both the client and the server side.


Carjay McGinnis added a comment - 14/May/08 04:27 PM - edited
I have now tested both Michelle's and Qarl's patches and like the results.

One more patch from me: since the OpenJPEG maintainers seem not to be too positive about the approach the SL viewer is taking, I went and fixed the check that was already present in the viewer and was supposed to detect whether a decode went wrong or did not receive enough data.

The check as currently implemented can never work since the "factor" field that was used is actually the original factor as set by the user (so it contains what you requested not what was actually decoded).
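The difference between the two checks can be shown with a toy Python model. Both functions and all parameter names are hypothetical and stand in for the idea only; they are not the viewer's or OpenJPEG's code. The broken variant compares the request against a field that merely echoes the request, while the fixed variant derives the reduction that actually happened from the number of decomposition levels the decoder processed.

```python
def broken_check(requested_reduce: int) -> bool:
    """Compare the request with a field that just echoes the request back."""
    factor_field = requested_reduce  # what the user asked for, not the result
    return factor_field == requested_reduce  # always True, detects nothing


def fixed_check(requested_reduce: int, num_decompositions: int,
                levels_decoded: int) -> bool:
    """Pass only if the decoder got at least as far as was requested."""
    # Reduction that actually happened, based on what the decoder processed.
    actual_reduce = num_decompositions - levels_decoded
    return actual_reduce <= requested_reduce
```

For a stream with 5 decomposition levels where only 4 were decoded, a full-resolution request (`requested_reduce=0`) passes `broken_check` but fails `fixed_check`, which is the case that produces a top corner image.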

I have verified that the check now actually works (no top corner jobs even with OpenJPEG unpatched), but after applying Qarl and Michelle's patches it does not seem to be necessary really. I hardly if ever saw a decode fail.

The patches from Qarl and Michelle really seem to be complementary since I could leave out either one and the decode failures did not show up even when opening up the world viewer (which used to be the easiest way to trigger this condition).

Oh, if you apply my patch, be aware that unfortunately the current OpenJPEG library contains a small memory leak that will show if you use it (the library doesn't completely clean up the additionally requested codestream information which is necessary to check the number of decomposition levels that were decoded) and which leads to about 20 leaked bytes for each decoded image.

I pushed the patch to fix this upstream and hope it gets applied soon. Memleaks through defensive coding aren't really what I have in mind usually.

The attached patch also contains the one from VWR-4070 since they are hard to separate (and are both defensive).


Michelle2 Zenovka added a comment - 15/May/08 01:42 AM
Right, found an appropriate SVR issue for this titled "World Map download unnecessarily slow...." Will put comments on SVR bug there. Linked as related to

Robin Cornelius added a comment - 19/Dec/08 04:34 AM
I have not seen this issue for a while now, ever since Qarl's patches hit the release branch of the viewer.

There are related issues due to texture download speed, but this specific issue appears to have been resolved for a while and no one has commented in a long time, so I'm declaring it closed. Feel free to reopen if you can reproduce it on any new viewers.