This repository has been archived by the owner on Mar 28, 2024. It is now read-only.

[BUG-8711] Viewer crashes frequently when using Norman Antivirus Security Suite unless HttpPipelining is disabled. #16299

Open
1 task
sl-service-account opened this issue Mar 7, 2015 · 1 comment

Comments


sl-service-account commented Mar 7, 2015

This issue is related to the Forticlient + pipelining problem filed at BUG-8631, but the symptoms differ: there is no obvious visible corruption of mesh and textures, and the crash takes longer to reproduce unless you are in a busy area. I therefore thought it was worth a separate issue.

Steps To Reproduce

  • Install Norman AV Security Suite - the trial version is fine to reproduce the problem: http://www.norman.com/en/home_and_small_office/trials_downloads/free_trial_for_norman_security_suite_pro

  • Update to the latest virus definitions and reboot the computer.

  • You do not need to change any settings in Norman AV Security Suite after installation - out of the box settings will reproduce the problem.

  • Clear the viewer cache, make sure that HttpPipelining is enabled, and log in.

  • If you get a BSOD as soon as the world starts to render (see "Observed Behaviour"), reboot and log in again.

  • Watch your surroundings closely for any corrupted textures or mesh.

  • If you have not crashed already, teleport to a busy area, for example http://maps.secondlife.com/secondlife/Fawlty%20Towers/45/58/901

  • Cam about and observe surroundings.

  • Once the crash has been reproduced, clear the viewer cache, disable HttpPipelining, and repeat the above steps.

  • Compare viewer logs from a clean cache session with HttpPipelining enabled and a clean cache session with HttpPipelining disabled.
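
The log comparison in the last step can be sketched as a small script. This is only an illustration: the patterns are copied from the log excerpts quoted in this report, while the category names and the `count_warnings` / `compare_logs` helpers are hypothetical and not part of the viewer.

```python
import re
from collections import Counter

# Patterns taken from the log excerpts in this report.
# The category names on the left are illustrative labels only.
PATTERNS = {
    "mesh_range_mismatch": re.compile(
        r"LLMeshHandlerBase::onCompleted: Mesh response .* didn't overlap"
    ),
    "mesh_lod_failure": re.compile(
        r"LLMeshLODHandler::process(?:Failure|Data): Error during mesh LOD"
    ),
    "kdu_error": re.compile(r"LLKDUMessageError::put_text: KDU Error"),
}


def count_warnings(log_text: str) -> Counter:
    """Count how many log lines match each known warning pattern."""
    counts = Counter({name: 0 for name in PATTERNS})
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def compare_logs(enabled_text: str, disabled_text: str) -> dict:
    """Return {category: (count_with_pipelining, count_without)}."""
    enabled = count_warnings(enabled_text)
    disabled = count_warnings(disabled_text)
    return {name: (enabled[name], disabled[name]) for name in PATTERNS}
```

To use it, read the two session logs into strings (for example with `pathlib.Path(...).read_text(errors="replace")`) and pass them to `compare_logs`. Both sessions should start from a clean cache, otherwise the counts are not comparable.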

Observed Behaviour

HttpPipelining Enabled

  • Sometimes the system will BSOD a few seconds after login, just as the world starts to render.
    See comments section.

  • When you log in to a quiet region, everything appears normal. If you look hard enough you will see the odd corrupted texture (rainbow coloured) or a mesh that is not loading correctly (stuck as triangles), but the visible corruption is nowhere near as bad as that seen when using Forticlient with pipelining enabled (BUG-8631).

  • If you stay on a quiet region, the viewer will take a fairly long time to crash - at least half an hour.

  • After teleporting into a busy region, there are still no obvious signs of mesh or texture corruption.

  • Even though everything appears fine visibly, logs tell a different story.
    The same warnings are seen in logs as when using Forticlient with pipelining enabled.

    Lots of these:

    2015-03-07T23:11:22Z WARNING: LLMeshHandlerBase::onCompleted: Mesh response (bytes [1024..3829]) didn't overlap with request's origin (bytes [885..3765]).
    2015-03-07T23:11:22Z WARNING: LLMeshLODHandler::processFailure: Error during mesh LOD handling.  ID:  f592bd1c-efe6-3df8-69d8-c8ac0f705386, Reason:  Invalid Content-Range header encountered (Core_4).  Not retrying.
    2015-03-07T23:11:22Z WARNING: LLMeshLODHandler::processData: Error during mesh LOD processing.  ID:  9821dce4-c490-db72-05f6-4a7d9e29b2b5, Unknown reason.  Not retrying.

    The logs are also always full of KDU blowups:

    2015-03-07T23:10:39Z INFO: LLKDUMessageError::put_text: KDU Error: Kakadu Core Error:
    
    2015-03-07T23:10:39Z INFO: LLKDUMessageError::put_text: KDU Error: Illegal inclusion tag tree encountered while decoding a packet header.  This problem can arise if empty packets are used (i.e., packets whose first header bit is 0) and the value coded by the inclusion tag tree in a subsequent packet is not exactly equal to the index of the quality layer in which each code-block makes its first contribution.  Such an error may arise from a mis-interpretation of the standard.  The problem may also occur as a result of a corrupted code-stream.  Try re-opening the image with the resilient mode enabled.
  • The viewer will usually crash within minutes of entering a busy region.
    As with the BUG-8631 crash, a dmp is often not generated when the viewer crashes; you just get a Windows dialog saying "Second Life has stopped working..."

    When the viewer does produce a dmp file, the callstack appears to be the same crash as when using Forticlient:

    >	ntdll.dll!_RtlReportCriticalFailure@8()  + 0x57 bytes	
     	ntdll.dll!_RtlpReportHeapFailure@4()  + 0x21 bytes	
     	ntdll.dll!_RtlpLogHeapFailure@24()  + 0xa1 bytes	
     	ntdll.dll!_RtlFreeHeap@12()  + 0x500a0 bytes	
     	kernel32.dll!_HeapFree@12()  + 0x14 bytes	
     	msvcr100.dll!_free()  + 0x1c bytes	
     	msvcr100.dll!__aligned_free()  + 0x17 bytes	
  • See Whirly_HttpPipelining_enabled logs attached for a session ending in a crash (no dmp unfortunately).
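
The mesh warning quoted above reports two byte ranges. A minimal sketch for pulling them out of a log line and comparing them is shown below. The `parse_mesh_ranges` helper is hypothetical, and treating both ranges as inclusive is an assumption made here, not something the log confirms.

```python
import re

# Matches the "didn't overlap" warning quoted in this report.
RANGE_RE = re.compile(
    r"Mesh response \(bytes \[(\d+)\.\.(\d+)\]\) didn't overlap with "
    r"request's origin \(bytes \[(\d+)\.\.(\d+)\]\)"
)


def parse_mesh_ranges(line: str):
    """Extract response/request byte ranges from a mesh warning line.

    Returns None if the line is not a range-mismatch warning.
    Lengths assume inclusive ranges (an assumption, see lead-in).
    """
    m = RANGE_RE.search(line)
    if m is None:
        return None
    resp_start, resp_end, req_start, req_end = map(int, m.groups())
    return {
        "response": (resp_start, resp_end),
        "request": (req_start, req_end),
        # Positive values mean the response is shifted past the request.
        "start_shift": resp_start - req_start,
        "end_shift": resp_end - req_end,
        "response_len": resp_end - resp_start + 1,
        "request_len": req_end - req_start + 1,
    }
```

For the line quoted in this report, the response starts at byte 1024 while the request started at byte 885, so the first requested byte is not in the reply, and the two ranges have different lengths. That would be consistent with a reply being matched to the wrong request, but this sketch alone does not establish the cause.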

HttpPipelining Disabled

  • No texture or mesh corruption at all.

  • None of the above warnings or KDU blowups in the session logs.
    Note that if the cache was populated while pipelining was enabled and has not been cleared, you will still see the KDU blowups in the logs after pipelining is disabled.

  • Viewer does not crash.

  • No system BSODs

  • See Whirly_HttpPipelining_disabled logs attached for a session visiting the same regions with Pipelining disabled.

Expected Behaviour

The viewer should not crash frequently when HttpPipelining is enabled while Norman Antivirus Security Suite is installed.

Other Information

  • This problem with Norman Antivirus Security Suite causing Pipelining enabled viewers to crash was initially reported by one of the Firestorm beta testers - see http://jira.phoenixviewer.com/browse/FIRE-15677
    Disabling HttpPipelining also fixed the crashes for this user.

  • I do not think this crash is network related, since I do not crash when using AVG antivirus with Pipelining enabled viewers, but here is my network information just in case:

Network Information

  • Router make: TP-Link

  • Router Model: TD-W8961ND

  • Software version: Firmware Version: 3.0.0 Build 120808 Rel.28888, ADSL Firmware Version: FwVer:3.20.17.0_TC3087 HwVer:T14.F7_11.2

  • IP Address: 88.104.228.205 (not static)

  • ISP: TalkTalk

  • Hardwired to router, standard broadband.

    C:\Users\Whirly>tracert login.agni.lindenlab.com
    
    Tracing route to login.agni.lindenlab.com [216.82.54.28]
    over a maximum of 30 hops:
    
      1    <1 ms    <1 ms    <1 ms  192.168.1.1
      2    19 ms    19 ms    22 ms  88-104-224-1.dynamic.dsl.as9105.com [88.104.224.1]
      3    25 ms    24 ms    24 ms  85-210-252-0.dynamic.dsl.as9105.com [85.210.252.0]
      4    24 ms    25 ms    26 ms  85-210-252-184.dynamic.dsl.as9105.com [85.210.252.184]
      5    30 ms    31 ms    30 ms  host-78-144-9-5.as13285.net [78.144.9.5]
      6    30 ms    30 ms    31 ms  host-78-144-13-34.as13285.net [78.144.13.34]
      7    32 ms    32 ms    32 ms  unknown.Level3.net [212.187.192.113]
      8   164 ms   156 ms   156 ms  ae-1-8.bar1.Phoenix1.Level3.net [4.69.133.29]
      9     *      156 ms     *     ae-1-8.bar1.Phoenix1.Level3.net [4.69.133.29]
     10     *        *      187 ms  LINDEN-RESE.bar1.Phoenix1.Level3.net [4.53.104.14]
     11     *        *        *     Request timed out.
     12     *        *        *     Request timed out.
     13   184 ms   183 ms   158 ms  login-phx3.agni.lindenlab.com [216.82.54.28]
    
    Trace complete.

Attachments


Original Jira Fields
| Field | Value |
| --- | --- |
| Issue | BUG-8711 |
| Summary | Viewer crashes frequently when using Norman Antivirus Security Suite unless HttpPipelining is disabled. |
| Type | Bug |
| Priority | Unset |
| Status | Accepted |
| Resolution | Accepted |
| Reporter | Whirly Fizzle (whirly.fizzle) |
| Created at | 2015-03-07T22:30:10Z |
| Updated at | 2015-03-09T17:32:52Z |
{
  'Business Unit': ['Platform'],
  'Severity': 'Unset',
  'System': 'SL Viewer',
  'Target Viewer Version': 'viewer-development',
  'What just happened?': '.',
  'What were you doing when it happened?': 'Filling in...',
  'What were you expecting to happen instead?': '.',
}

Whirly Fizzle commented at 2015-03-08T02:19:53Z

Once Norman Antivirus Security Suite was uninstalled and my usual antivirus (AVG) was reinstalled, the BSODs when logging into SL also stopped.

Poking at the BSOD memory.dmp indicates the BSOD was some kind of networking crash. It is way over my head, but it seems to be a crash when trying to inject a UDP packet, and it is definitely caused by Norman Antivirus Security Suite.

4: kd> .symfix; .reload
Loading Kernel Symbols
...............................................................
................................................................
................................
Loading User Symbols

Loading unloaded module list
.....

************* Symbol Loading Error Summary **************
Module name            Error
SharedUserData         No error - symbol load deferred

You can troubleshoot most symbol related issues by turning on symbol loading diagnostics (!sym noisy) and repeating the command that caused symbols to be loaded.
You should also verify that your symbol search path (.sympath) is correct.
4: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff8800185f1d8, The address that the exception occurred at
Arg3: fffff88004157358, Exception Record Address
Arg4: fffff88004156bb0, Context Record Address

Debugging Details:
------------------

*** ERROR: Module load completed but symbols could not be loaded for ale7_nf64.sys

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

FAULTING_IP: 
tcpip!IppJoinPath+328
fffff880`0185f1d8 4c396b10        cmp     qword ptr [rbx+10h],r13

EXCEPTION_RECORD:  fffff88004157358 -- (.exr 0xfffff88004157358)
ExceptionAddress: fffff8800185f1d8 (tcpip!IppJoinPath+0x0000000000000328)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000000000780010
Attempt to read from address 0000000000780010

CONTEXT:  fffff88004156bb0 -- (.cxr 0xfffff88004156bb0;r)
rax=0000000000000001 rbx=0000000000780000 rcx=fffff88004157590
rdx=fffff88004157760 rsi=0000000000000000 rdi=fffff88004157760
rip=fffff8800185f1d8 rsp=fffff88004157590 rbp=fffff880041577b0
 r8=0000000000000000  r9=0000000000000000 r10=fffff880009b3ac0
r11=fffff880041576c8 r12=fffffa800cdf4b40 r13=0000000000000000
r14=fffff88004157ac0 r15=0000000000000002
iopl=0         nv up ei ng nz ac pe cy
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010293
tcpip!IppJoinPath+0x328:
fffff880`0185f1d8 4c396b10        cmp     qword ptr [rbx+10h],r13 ds:002b:00000000`00780010=????????????????
Last set context:
rax=0000000000000001 rbx=0000000000780000 rcx=fffff88004157590
rdx=fffff88004157760 rsi=0000000000000000 rdi=fffff88004157760
rip=fffff8800185f1d8 rsp=fffff88004157590 rbp=fffff880041577b0
 r8=0000000000000000  r9=0000000000000000 r10=fffff880009b3ac0
r11=fffff880041576c8 r12=fffffa800cdf4b40 r13=0000000000000000
r14=fffff88004157ac0 r15=0000000000000002
iopl=0         nv up ei ng nz ac pe cy
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010293
tcpip!IppJoinPath+0x328:
fffff880`0185f1d8 4c396b10        cmp     qword ptr [rbx+10h],r13 ds:002b:00000000`00780010=????????????????
Resetting default scope

DEFAULT_BUCKET_ID:  WIN7_DRIVER_FAULT

PROCESS_NAME:  System

CURRENT_IRQL:  0

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

EXCEPTION_PARAMETER1:  0000000000000000

EXCEPTION_PARAMETER2:  0000000000780010

READ_ADDRESS:  0000000000780010 

FOLLOWUP_IP: 
fwpkclnt!FwpsConstructIpHeaderForTransportPacket0+20a
fffff880`01a50066 85c0            test    eax,eax

BUGCHECK_STR:  0x7E

ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre

LAST_CONTROL_TRANSFER:  from fffff88001927b15 to fffff8800185f1d8

STACK_TEXT:  
fffff880`04157590 fffff880`01927b15 : 00000000`00000000 fffff880`04157760 fffff880`0196d9a0 00000000`00000011 : tcpip!IppJoinPath+0x328
fffff880`041576d0 fffff880`01a50066 : 00000000`00000000 61626364`00000000 00000000`00000000 00000000`00000000 : tcpip!IppInspectBuildHeaders+0x445
fffff880`041579b0 fffff880`04207fc1 : fffffa80`1373a5a0 fffff880`00000014 00000000`00000000 00000000`00000002 : fwpkclnt!FwpsConstructIpHeaderForTransportPacket0+0x20a
fffff880`04157a50 fffff800`03128b8a : fffffa80`0d582b50 00000000`00000080 fffffa80`0ca725f0 fffffa80`0d582b50 : ale7_nf64+0x7fc1
fffff880`04157c00 fffff800`02e7b8e6 : fffff880`009b3180 fffffa80`0d582b50 fffff880`009be0c0 fffffa80`0cfa0c60 : nt!PspSystemThreadStartup+0x5a
fffff880`04157c40 00000000`00000000 : fffff880`04158000 fffff880`04152000 fffff880`04157710 00000000`00000000 : nt!KxStartSystemThread+0x16


SYMBOL_STACK_INDEX:  2

SYMBOL_NAME:  fwpkclnt!FwpsConstructIpHeaderForTransportPacket0+20a

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: fwpkclnt

IMAGE_NAME:  fwpkclnt.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  533f5b09

STACK_COMMAND:  .cxr 0xfffff88004156bb0 ; kb

FAILURE_BUCKET_ID:  X64_0x7E_fwpkclnt!FwpsConstructIpHeaderForTransportPacket0+20a

BUCKET_ID:  X64_0x7E_fwpkclnt!FwpsConstructIpHeaderForTransportPacket0+20a

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:x64_0x7e_fwpkclnt!fwpsconstructipheaderfortransportpacket0+20a

FAILURE_ID_HASH:  {d08c9d47-4138-022d-381f-4ae3456cadf9}

Followup: MachineOwner
---------
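
As a rough aid for reading the STACK_TEXT section above, the sketch below lists the module behind each frame and flags frames that have no symbol name (no `!` in the frame). The `stack_modules` and `unsymbolized` helpers are hypothetical. A missing symbol only means the debugger could not load symbols for that module, which is typical of third-party drivers; by itself it does not prove where the fault lies.

```python
import re

# One kb-style frame: child-SP, return address, four args, then the symbol.
FRAME_RE = re.compile(r"^[0-9a-f`]+ [0-9a-f`]+ : .* : (\S+)$")


def stack_modules(stack_text: str):
    """Return [(module, has_symbol), ...] for each frame, top first."""
    frames = []
    for line in stack_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m is None:
            continue  # not a stack frame line
        symbol = m.group(1)
        has_symbol = "!" in symbol
        # Module name is everything before the first '!' or '+'.
        module = re.split(r"[!+]", symbol, maxsplit=1)[0]
        frames.append((module, has_symbol))
    return frames


def unsymbolized(frames):
    """Modules that appear in the stack without a resolved symbol."""
    seen = []
    for module, has_symbol in frames:
        if not has_symbol and module not in seen:
            seen.append(module)
    return seen
```

Applied to the stack in this comment, the only frame without a symbol is `ale7_nf64`, which matches the "symbols could not be loaded for ale7_nf64.sys" message earlier in the output.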
