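Syscall speed is critical if you build 1:1 threading on a hybrid user-space/kernel-space blocking or condition-variable primitive: the uncontended case stays in user space, but every time a thread actually has to block or be woken, the primitive falls through to a system call.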
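
To make the fast-path/slow-path split concrete, here is a toy model in Python. It is only a sketch under loud assumptions: `HybridLock`, `uncontended` and `contended` are hypothetical names of mine, a non-blocking `threading.Lock` attempt stands in for the user-space atomic compare-and-swap, and a blocking acquire stands in for the kernel wait (something like a futex wait on Linux). It does not issue real futex calls; it just counts how often a real implementation would have had to enter the kernel.

```python
import threading
import time


class HybridLock:
    """Toy model of a hybrid lock: try in 'user space' first, block 'in the kernel' only if needed."""

    def __init__(self):
        self._lock = threading.Lock()    # stands in for the atomic lock word
        self._stats = threading.Lock()   # protects the counter below
        self.slow_path_entries = 0       # times a blocking syscall would have been needed

    def acquire(self):
        # Fast path: non-blocking attempt, analogous to an atomic compare-and-swap.
        if self._lock.acquire(blocking=False):
            return
        # Slow path: analogous to a blocking wait syscall that parks the thread.
        with self._stats:
            self.slow_path_entries += 1
        self._lock.acquire()

    def release(self):
        self._lock.release()


def uncontended(n):
    """One thread locking and unlocking n times never leaves the fast path."""
    lock = HybridLock()
    for _ in range(n):
        lock.acquire()
        lock.release()
    return lock.slow_path_entries


def contended():
    """A second thread that finds the lock held has to take the slow path once."""
    lock = HybridLock()
    lock.acquire()  # main thread takes the lock via the fast path

    def worker():
        lock.acquire()  # lock is held, so this falls through to the slow path
        lock.release()

    t = threading.Thread(target=worker)
    t.start()
    # Wait until the worker has recorded its slow-path entry, then let it proceed.
    while lock.slow_path_entries == 0:
        time.sleep(0.001)
    lock.release()
    t.join()
    return lock.slow_path_entries


if __name__ == "__main__":
    print("uncontended slow-path entries:", uncontended(1000))
    print("contended slow-path entries:", contended())
```

In this model the uncontended loop never touches the slow path, while each thread that really has to wait pays for one trip into the kernel. In a real 1:1 threading library that trip is an actual system call, which is why its cost shows up directly in lock and condition-variable latency under contention.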