#8650: KDL launching WebPositive development version
----------------------+----------------------------
   Reporter:  aldeck  |      Owner:  nobody
       Type:  bug     |     Status:  new
   Priority:  high    |  Milestone:  R1
  Component:  System  |    Version:  R1/Development
 Resolution:          |   Keywords:
 Blocked By:          |   Blocking:
Has a Patch:  0       |   Platform:  All
----------------------+----------------------------

Comment (by bonefish):

 Replying to [comment:8 anevilyak]:
 > Replying to [comment:5 bonefish]:
 > > So maybe it's not an unsafe syscall after all, but the faults are caused by a stack corruption with an entirely different cause. Syscall tracing might help to clarify that point, but I don't really have the time to play with that ATM.
 >
 > That case unfortunately appears to be the more likely cause. I've gone through and reviewed all usage of user_strlcpy in the kernel, and there doesn't appear to be a tangible culprit to fit the bill as far as that's concerned.

 Have you checked the `user_memcpy()` uses as well? There are significantly more of those.

 A kernel stack address of 0 is not permissible, but the address specification is `B_ANY_KERNEL_ADDRESS` in this case, i.e. the address passed in is ignored. I don't think mapping the stacks has anything to do with the problem. It's just the last VM operation when creating the thread.

 If you want to track this further, I'd recommend the following:
 * Add a `panic()` right after the `dprintf()` we see in the syslog ([http://cgit.haiku-os.org/haiku/tree/src/system/kernel/vm/vm.cpp#n4037]). When I tested it, the stack was already corrupt at that point, so there is no reason to continue.
 * Enable kernel tracing for threads and syscalls. This helps to see the syscalls after the creation of the thread until its demise. Whether the cause is a `user_memcpy()` or not, it is very likely that the corruption happens in a syscall or code called from it. To help with checking the `user_memcpy()` theory, add a `ktrace_printf()` in it.
   Since `user_memcpy()` in syscalls usually copies from or to the kernel stack, it is probably not easy to see whether the stack is overwritten erroneously. But maybe something catches the eye (like source and destination addresses both being kernel addresses or a huge copy size).
 * If the thread overwrites its own stack, then likely the last syscall is to blame (well, there could be an interrupt where this happens, but interrupts happen all the time). Inserting `ktrace_printf()`s at some places and enabling stack traces for them (they need to be long enough) should help to narrow down when exactly the stack gets broken.
 * If some other thread overwrites this thread's stack or if the heap is corrupted, things are more complicated. A possible strategy is to try to find out what is corrupted and add respective sanity checks at various places. This would at least narrow it down.

 Hopefully some more useful lead presents itself.

--
Ticket URL: <http://dev.haiku-os.org/ticket/8650#comment:9>
Haiku <http://dev.haiku-os.org>
Haiku - the operating system.
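
The two warning signs mentioned above for `user_memcpy()` calls (source and destination both being kernel addresses, or a huge copy size) could be checked with something like the following sketch. It is a standalone illustration, not kernel code: the helper names, the kernel base address and the size threshold are assumptions made for the example and are not taken from the Haiku sources.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed values for illustration only; the real kernel base and a sensible
// size threshold depend on the architecture and on the workload.
static const uint64_t kAssumedKernelBase = 0x80000000ULL;
static const size_t kAssumedHugeCopySize = 64 * 1024;

// Hypothetical helper: treats everything at or above the assumed base as a
// kernel address.
static bool
is_kernel_address(uint64_t address)
{
	return address >= kAssumedKernelBase;
}

// Hypothetical helper: returns a short description when a copy looks
// suspicious, or nullptr when nothing catches the eye.
static const char*
suspicious_copy_reason(uint64_t to, uint64_t from, size_t size)
{
	if (is_kernel_address(to) && is_kernel_address(from))
		return "source and destination are both kernel addresses";
	if (size >= kAssumedHugeCopySize)
		return "huge copy size";
	return nullptr;
}
```

In a real kernel build, a non-null result would presumably be what gets passed along to the `ktrace_printf()` added to `user_memcpy()`, so that suspicious copies stand out in the trace output.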