Path finding cannot be parallelized on the GPU. Every thread has to execute the same instructions. Only the data can vary, not the control flow. (Though conditional instructions enable if/else by executing whichever branch isn't taken as a NOOP.)
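To make the masking idea concrete, here is a rough CPU-side sketch in Java of how a group of lockstep threads could handle an if/else. It is only a simulation under stated assumptions: the class `PredicationSketch`, its methods, and the even/odd condition are hypothetical names made up for illustration, not any real GPU or Java API.

```java
import java.util.Arrays;

// Hypothetical illustration: simulates, on the CPU, a group of GPU threads
// ("lanes") that must all follow the same instruction stream.
final class PredicationSketch {

    // Each branch is issued to the whole group; a per-lane mask decides
    // which lanes actually keep the result. Masked-off lanes do nothing.
    static int[] runLockstep(int[] data) {
        int lanes = data.length;
        boolean[] mask = new boolean[lanes];
        int[] out = new int[lanes];

        // Step 1: every lane evaluates the condition (assumed: "is even").
        for (int lane = 0; lane < lanes; lane++) {
            mask[lane] = data[lane] % 2 == 0;
        }
        // Step 2: "then" branch pass; lanes with mask == false are a no-op.
        for (int lane = 0; lane < lanes; lane++) {
            if (mask[lane]) out[lane] = data[lane] / 2;
        }
        // Step 3: "else" branch pass; lanes with mask == true are a no-op.
        for (int lane = 0; lane < lanes; lane++) {
            if (!mask[lane]) out[lane] = data[lane] * 3 + 1;
        }
        return out;
    }

    // Number of branch passes the group has to sit through: a branch is
    // only issued if at least one lane takes it.
    static int passesIssued(int[] data) {
        boolean anyThen = false;
        boolean anyElse = false;
        for (int value : data) {
            if (value % 2 == 0) anyThen = true; else anyElse = true;
        }
        return (anyThen ? 1 : 0) + (anyElse ? 1 : 0);
    }

    public static void main(String[] args) {
        int[] mixed = {1, 2, 3, 4};
        int[] uniform = {2, 4, 6, 8};
        System.out.println(Arrays.toString(runLockstep(mixed)));
        System.out.println("passes (mixed):   " + passesIssued(mixed));
        System.out.println("passes (uniform): " + passesIssued(uniform));
    }
}
```

When the lanes disagree, the group sits through both passes and each lane does useful work in only one of them. When they all agree, only one pass is needed.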
I did not understand what the article was saying Java 9 would have. Would it be an API, or would it work with old code, where things like for loops are vectorized automatically?
Good point... but I think that branching code can be processed on the GPU (OpenCL talks about 'wavefronts' for groups of threads on the same branch). It's just that with every branch the GPU acceleration gets much less effective. I think the AI in the latest Civ was GPU accelerated, and that had to have a bit of branching.