[haiku-bugs] Re: [Haiku] #5372: USB errors on Aspire One Netbook / Intel 82801G

  • From: "mmlr" <trac@xxxxxxxxxxxx>
  • Date: Thu, 25 Feb 2010 23:42:18 -0000

#5372: USB errors on Aspire One Netbook / Intel 82801G
-------------------------+--------------------------------------------------
 Reporter:  kallisti5    |       Owner:  mmlr          
     Type:  bug          |      Status:  new           
 Priority:  normal       |   Milestone:  R1            
Component:  Drivers/USB  |     Version:  R1/Development
 Keywords:               |   Blockedby:                
 Platform:  All          |    Blocking:                
-------------------------+--------------------------------------------------

Comment(by mmlr):

 Please don't attach zipped files, they are just inconvenient to work
 with. In size edge cases like this one it's better to trim or split the
 log.

 Oh and congratulations: you just won the prize for the worst possible
 interrupt mapping! Everything that has an interrupt is mapped to line 11.
 Better yet, as the syslog shows:

 {{{
 KERN: Disabling unhandled io interrupt 11
 }}}

 That should render pretty much all devices that depend on interrupts
 useless: audio, USB, wired and wireless network. Even storage would be
 affected if it used the SATA interface, but it doesn't and instead falls
 back to legacy interrupts 14 and 15.

 The line gets disabled between AHCI and UHCI init. Since AHCI is not in
 use, it's unlikely to be the trigger. One possible scenario is that the
 UHCI controller is left in a state with pending interrupts which neither
 the system reset nor the explicit host controller reset clears (which
 would be a quirk). The UHCI driver's init sets up its interrupt handler
 and then enables interrupts, which causes an interrupt storm. Since UHCI
 has the only interrupt handler installed at that point, the interrupt
 code decides that the line isn't shared and therefore disables it
 entirely.
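
 To illustrate the decision described above, here is a minimal sketch in
 C. It is not the actual Haiku interrupt code: the struct, the function
 names and the "at most one handler means not shared" rule are all
 assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLERS 4

/* A handler returns true if its device raised the interrupt. */
typedef bool (*irq_handler)(void *cookie);

/* Hypothetical model of one interrupt line (not a Haiku structure). */
struct irq_line {
    irq_handler handlers[MAX_HANDLERS];
    void *cookies[MAX_HANDLERS];
    size_t handler_count;
    bool enabled;
};

/* Installing a handler enables the line; the first install is what
 * exposes any interrupt that is already pending on it. */
bool install_handler(struct irq_line *line, irq_handler handler, void *cookie)
{
    if (line->handler_count >= MAX_HANDLERS)
        return false;
    line->handlers[line->handler_count] = handler;
    line->cookies[line->handler_count] = cookie;
    line->handler_count++;
    line->enabled = true;
    return true;
}

/* Dispatch one interrupt; returns true if any handler claimed it. */
bool dispatch_irq(struct irq_line *line)
{
    bool handled = false;
    for (size_t i = 0; i < line->handler_count; i++) {
        if (line->handlers[i](line->cookies[i]))
            handled = true;
    }
    /* Simplification: with at most one handler the line is treated as
     * not shared, so an unclaimed interrupt disables it right away. */
    if (!handled && line->handler_count <= 1)
        line->enabled = false;
    return handled;
}

/* Stand-in for a driver whose own device did not raise the interrupt. */
bool not_my_interrupt(void *cookie)
{
    (void)cookie;
    return false;
}
```

 With only `not_my_interrupt` installed, a single `dispatch_irq()` call
 leaves the line disabled, which is the situation the syslog line
 reports.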

 Another possibility is that installing the first interrupt handler
 enables the interrupt line and some other device is actually causing the
 storm, with the same end result. It's a bit hard to tell, but the UHCI
 code has explicit "storm protection" that clears disabled UHCI
 interrupts, so this strikes me as the more likely case. Even if the UHCI
 controller still had pending interrupts when they were enabled, they are
 cleared at the first interrupt handler call at the latest.
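
 Roughly, that kind of storm protection can be sketched as follows. This
 is a simulation against a fake register, not the Haiku UHCI driver; the
 names are made up, and the assumption that the low five USBSTS bits are
 write-1-to-clear comes from my reading of the UHCI spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: USBSTS bits 0-4 are the write-1-to-clear status bits. */
#define UHCI_STS_CLEARABLE 0x001f

/* Fake controller with only a status register (illustration only). */
struct fake_uhci {
    uint16_t usbsts;
};

/* Simulates a write-1-to-clear register write. */
static void write_status(struct fake_uhci *hc, uint16_t value)
{
    hc->usbsts &= (uint16_t)~value;
}

/* Hypothetical handler: acknowledge every pending status bit, even for
 * sources the driver never enabled, so none of them can keep the line
 * asserted. Returns true if the interrupt came from this device. */
bool uhci_like_interrupt(struct fake_uhci *hc)
{
    uint16_t status = hc->usbsts & UHCI_STS_CLEARABLE;
    if (status == 0)
        return false;       /* not ours */
    write_status(hc, status);
    return true;
}
```

 The point is that after the first call nothing from this controller is
 left pending, so a storm that continues has to come from somewhere else.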

 The culprit is therefore likely one of the other devices sharing this
 interrupt line. What you can try is to remove their drivers one by one.
 The catch is that once you experience the problem you need to cold boot,
 to make sure you don't just carry it over from the last reboot, which
 makes things a bit tedious. The process would be: boot and remove a
 driver; if the problem is already present, cold boot; then warm reboot to
 check whether the problem comes up; if it does, continue with removing
 the next driver. The drivers in question are ehci, hda, intel_extreme,
 the wired rtl network driver and the atheros wireless one if installed.
 Personally I'd start with removing ehci.

 In general this is a pretty nasty problem though, because a driver can't
 really do anything about situations like these. It can only control the
 interrupts of its own device and cannot reset the interrupt state of
 others. Therefore, if such a state is present on boot, the first
 unsuspecting driver that happens to install an interrupt handler will
 trigger it. A solution would be to separate device init and interrupt
 enabling into two separate driver calls, which might not be possible for
 all device types (since some may depend on interrupts for initial device
 setup).
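
 As a sketch of what such a split could look like (purely hypothetical
 hooks, nothing like this exists in the driver API as far as this comment
 is concerned): every device on the line gets initialized first, and only
 then are interrupts switched on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Fake device used only for this illustration. */
struct fake_device {
    bool pending;             /* stale interrupt left from before the reboot */
    bool interrupts_enabled;
};

/* Hypothetical hook 1: reset the device and clear stale interrupt
 * state, without enabling interrupts yet. */
void device_init(struct fake_device *dev)
{
    dev->interrupts_enabled = false;
    dev->pending = false;
}

/* Hypothetical hook 2: called only after all devices were initialized. */
void device_enable_interrupts(struct fake_device *dev)
{
    dev->interrupts_enabled = true;
}

/* Two-phase bring-up. Returns how many devices still had stale pending
 * state at the moment their interrupts were switched on. */
size_t bring_up_shared_line(struct fake_device *devs, size_t count)
{
    size_t stale = 0;
    for (size_t i = 0; i < count; i++)
        device_init(&devs[i]);
    for (size_t i = 0; i < count; i++) {
        if (devs[i].pending)
            stale++;
        device_enable_interrupts(&devs[i]);
    }
    return stale;
}
```

 Because phase one runs for all devices before phase two starts, no
 device can hit the shared line with leftover state while another driver
 is still waiting to be initialized.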

 Another possible solution would be what Be did: disabled interrupt lines
 get re-enabled after a while if a handler is installed. That might just
 let the system get far enough to call the drivers of the other devices
 and thereby clear the problem.
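
 A minimal sketch of that re-enable idea, with made-up names and a
 made-up delay (how Be actually implemented it is not something I'm
 describing here):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical delay before a disabled line gets another chance. */
#define REENABLE_DELAY_US 1000000ULL

/* Hypothetical per-line bookkeeping. */
struct irq_line_state {
    bool enabled;
    int handler_count;
    uint64_t disabled_at;   /* time the line was disabled, in microseconds */
};

/* Called periodically. Re-enables a disabled line once the delay has
 * passed, but only if at least one handler is installed. Returns true
 * if the line was re-enabled by this call. */
bool maybe_reenable(struct irq_line_state *line, uint64_t now)
{
    if (line->enabled || line->handler_count == 0)
        return false;
    if (now - line->disabled_at < REENABLE_DELAY_US)
        return false;
    line->enabled = true;
    return true;
}
```

 If the storm is still there the line simply gets disabled again, but in
 the meantime more drivers may have been loaded and had the chance to
 quiet their devices.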

 Also, I'm not sure whether we actually unload all drivers when shutting
 down. At least the USB host controller drivers don't necessarily clean up
 after themselves properly. Usually this isn't a problem, but in your case
 it might just be. Since the USB stack is B_KEEP_LOADED, I'm not even sure
 whether it is ever destroyed at all, which is what would give the drivers
 the chance to clean up. If it is, a good idea would be to additionally
 disable/clear interrupts on host controller driver teardown.
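
 What such a teardown step could do, sketched against fake registers
 (again an illustration, not the existing driver code; the register
 layout is simplified and the write-1-to-clear mask is an assumption
 based on the UHCI spec):

```c
#include <stdint.h>

/* Assumption: USBSTS bits 0-4 are the write-1-to-clear status bits. */
#define STS_WRITE_CLEAR_BITS 0x001f

/* Fake host controller registers (illustration only). */
struct fake_host_controller {
    uint16_t usbsts;    /* pending status bits */
    uint16_t usbintr;   /* interrupt enable bits */
};

/* Hypothetical teardown helper: first stop the controller from raising
 * new interrupts, then acknowledge whatever is still pending so nothing
 * is carried over into the next boot. */
void quiesce_on_teardown(struct fake_host_controller *hc)
{
    hc->usbintr = 0;
    /* Simulated effect of writing 1s to the write-1-to-clear bits. */
    hc->usbsts &= (uint16_t)~STS_WRITE_CLEAR_BITS;
}
```

 This only helps if the teardown path is actually reached on shutdown,
 which is exactly the open question above.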

-- 
Ticket URL: <http://dev.haiku-os.org/ticket/5372#comment:10>
Haiku <http://dev.haiku-os.org>
Haiku - the operating system.
