[haiku-bugs] Re: [Haiku] #12319: Time/NTP sometimes crashes in gethostbyname()

  • From: "ttcoder" <trac@xxxxxxxxxxxx>
  • Date: Tue, 25 Aug 2015 12:40:46 -0000

#12319: Time/NTP sometimes crashes in gethostbyname()
----------------------------------+----------------------------
Reporter: ttcoder | Owner: nobody
Type: bug | Status: new
Priority: normal | Milestone: R1/beta1
Component: Network & Internet | Version: R1/Development
Resolution: | Keywords:
Blocked By: | Blocking:
Has a Patch: 0 | Platform: All
----------------------------------+----------------------------

Comment (by ttcoder):

In light of yak's analysis in the other ticket, it occurs to me that maybe
gethostbyname() works on some sort of 'static' (team-wide) data. If so, a
thread that is killed in the middle of gethostbyname() could leave that
data in an inconsistent state, and all further calls to gethostbyname()
would fail or crash?

Does that make sense? I'm not sure. If so, for future reference, is there a
"re-entrant" version of gethostbyname() (somewhat like `localtime()` has a
`localtime_r()` re-entrant counterpart)?
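
For reference, POSIX specifies getaddrinfo() as the thread-safe replacement
for gethostbyname(): it returns its results in memory allocated for that
call instead of a shared static buffer. Below is a minimal sketch of what
such a lookup could look like. The helper name resolve_ipv4() is
hypothetical and not part of Haiku's ntp.cpp, and the sketch assumes an
IPv4-only UDP lookup. Note that a thread-safe resolver does not by itself
make it safe to kill a thread in the middle of the call, since the library
may still hold internal locks at that point.

```cpp
#include <cassert>
#include <cstring>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper (not from Haiku's ntp.cpp): resolve `host` to an IPv4
// address via getaddrinfo() instead of gethostbyname().
// Returns 0 on success, or a getaddrinfo() error code (see gai_strerror()).
static int
resolve_ipv4(const char* host, sockaddr_in* out)
{
	assert(host != nullptr && out != nullptr);

	addrinfo hints;
	std::memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_INET;       // assumption: IPv4 only
	hints.ai_socktype = SOCK_DGRAM;  // assumption: UDP, as used by NTP

	addrinfo* results = nullptr;
	int error = getaddrinfo(host, nullptr, &hints, &results);
	if (error != 0)
		return error;

	// The result list belongs to this call; copy the first entry and free it.
	std::memcpy(out, results->ai_addr, sizeof(sockaddr_in));
	freeaddrinfo(results);
	return 0;
}
```

A caller would pass the server name and a sockaddr_in to fill in, then set
sin_port itself before sending.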

I'm asking because I tried two "torture tests" just now:
- First I worked with Haiku's ntp.cpp, just adding a main() that invokes it
synchronously in an infinite loop, while slowing down my LAN with misc
traffic. I could not make it crash.
- Then I modified it to invoke it asynchronously (i.e. in a separate thread)
and kill said thread almost immediately, and indeed gethostbyname() is
hosed quickly:

{{{
KERN: debug_server: Thread 7921 entered the debugger: Segment violation
KERN: vm_soft_fault: va 0x0 not covered by area in address space
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault
at 0xa, ip 0x80136e83, write 0, user 0, thread 0x1ef3
KERN: vm_soft_fault: va 0x0 not covered by area in address space
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault
at 0xa, ip 0x80136e83, write 0, user 0, thread 0x1ef3
KERN: vm_soft_fault: va 0x0 not covered by area in address space
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault
at 0x4, ip 0x80136e83, write 0, user 0, thread 0x1ef3
KERN: vm_soft_fault: va 0x0 not covered by area in address space
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault
at 0xa, ip 0x80136e83, write 0, user 0, thread 0x1ef3
KERN: vm_soft_fault: va 0x0 not covered by area in address space
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault
at 0xa, ip 0x80136e83, write 0, user 0, thread 0x1ef3
KERN: vm_soft_fault: va 0x2e323000 not covered by area in address space
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault
at 0x2e323931, ip 0x80136e83, write 0, user 0, thread 0x1ef3
KERN: vm_soft_fault: va 0x0 not covered by area in address space
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault
at 0xa, ip 0x80136e83, write 0, user 0, thread 0x1ef3
KERN: vm_soft_fault: va 0x0 not covered by area in address space
(..etc..)
}}}

- If I increase the delay from 50000 µs to 100000 µs I get the exact same
stack crawl as at the top of this ticket.



Anyway, I'll obviously re-organize the code, which is defective due to that
unprotected kill_thread call, but I'll only feel 100% reassured if I can
also use a re-entrant version of gethostbyname :-)
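
One possible way to re-organize such code is to give the lookup a timeout
and, when it expires, abandon the worker thread rather than kill it, so the
worker can finish its library call and release whatever it holds. The
sketch below uses portable std::thread as a stand-in for Haiku's
spawn_thread()/kill_thread(). The names LookupState and run_with_timeout()
are hypothetical, and this is only an illustration of the pattern, not the
actual fix.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical shared state; kept alive by shared_ptr so an abandoned worker
// can still write to it safely after the caller has given up.
struct LookupState {
	std::mutex lock;
	std::condition_variable doneCond;
	bool done = false;
	int result = -1;
};

// Hypothetical helper: run `work` in a worker thread and wait up to `timeout`.
// Returns true and stores the value in *result if the work finished in time.
// On timeout the worker is left running (detached); it is never killed.
static bool
run_with_timeout(std::function<int()> work,
	std::chrono::milliseconds timeout, int* result)
{
	assert(result != nullptr);
	auto state = std::make_shared<LookupState>();

	std::thread([state, work]() {
		int value = work();  // e.g. the blocking host lookup
		std::lock_guard<std::mutex> guard(state->lock);
		state->result = value;
		state->done = true;
		state->doneCond.notify_one();
	}).detach();

	std::unique_lock<std::mutex> guard(state->lock);
	bool finished = state->doneCond.wait_for(guard, timeout,
		[&state] { return state->done; });
	if (!finished)
		return false;  // timed out; the worker finishes on its own
	*result = state->result;
	return true;
}
```

The trade-off is that a stuck lookup keeps its thread alive until the
resolver itself gives up, but no library call is ever interrupted halfway.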

--
Ticket URL: <https://dev.haiku-os.org/ticket/12319#comment:1>
Haiku <https://dev.haiku-os.org>
Haiku - the operating system.
