[SPARC64]: Fix bugs in SUN4V cpu mondo dispatch.

There were several bugs in the SUN4V cpu mondo dispatch code.

In fact, if we ever got a EWOULDBLOCK or other error from
the hypervisor call, we'd potentially send a cpu mondo multiple
times to the same cpu and even worse we could loop until the
timeout resending the same mondo over and over to such cpus.

So let's bulletproof this thing as follows:

1) Implement cpu_mondo_send() and cpu_state() hypervisor calls
   in arch/sparc64/kernel/entry.S, add prototypes to asm/hypervisor.h

2) Don't build and update the cpulist using inline functions, this
   was causing the cpu mask to not get updated in the caller.

3) Disable interrupts during the entire mondo send, otherwise our
   cpu list and/or mondo block could get overwritten if we take
   an interrupt and do a cpu mondo send on the current cpu.

4) Check for all possible error return types from the cpu_mondo_send()
   hypervisor call.  In particular:

   HV_EOK) Our work is done, all cpus have received the mondo.
   HV_CPUERROR) One or more of the cpus in the cpu list we passed
                to the hypervisor are in error state.  Use cpu_state()
                calls over the entries in the cpu list to see which
		ones.  Record them in "error_mask" and report this
		after we are done sending the mondo to cpus which are
		not in error state.
   HV_EWOULDBLOCK) We need to keep trying.

   Any other error we consider fatal, we report the event and exit
   immediately.

5) We only timeout if forward progress is not made.  Forward progress
   is defined as having at least one cpu get the mondo successfully
   in a given cpu_mondo_send() call.  Otherwise we bump a counter
   and delay a little.  If the counter hits a limit, we signal an
   error and report the event.

Also, smp_call_function_mask() error handling reports the number
of cpus incorrectly.

Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:
David S. Miller
2006-02-28 15:10:26 -08:00
committed by David S. Miller
parent aac0aadf09
commit b830ab665a
3 changed files with 159 additions and 55 deletions

View File

@@ -1795,3 +1795,31 @@ sun4v_cpu_yield:
ta HV_FAST_TRAP
retl
nop
/* %o0: num cpus in cpu list
* %o1: cpu list paddr
* %o2: mondo block paddr
*
* returns %o0: status
*/
.globl sun4v_cpu_mondo_send
sun4v_cpu_mondo_send:
mov HV_FAST_CPU_MONDO_SEND, %o5
ta HV_FAST_TRAP
retl
nop
/* %o0: CPU ID
*
* returns %o0: -status if status non-zero, else
* %o0: cpu state as HV_CPU_STATE_*
*/
.globl sun4v_cpu_state
sun4v_cpu_state:
mov HV_FAST_CPU_STATE, %o5
ta HV_FAST_TRAP
brnz,pn %o0, 1f
sub %g0, %o0, %o0
mov %o1, %o0
1: retl
nop