In general, that sounds reasonable, but there's a lot of room for variation in how you're "combining" world position with the Sobel response. What you're describing sounds a little like this paper (http://artis.imag.fr/~Cyril.Soler/DEA/NonPhotoRealisticRendering/Papers/p31-northrup.pdf), where they obtain a bunch of geometric silhouette edges and then use image-space information to link them together into cohesive chains (so they can make those nice long strokes).
Of course, I'm not really sure whether the output you're thinking of uses parameterized strokes like that or whether it's more of a pure image-processing technique. If it's the latter, the problem is a little easier. (Let me know what you do and how it turns out :))
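For the pure image-processing route, here's a rough sketch of one way the "combining" could go. To be clear, this is just my guess at a setup, not anything from the paper: `edge_mask`, the two thresholds, and the idea of OR-ing a luminance Sobel response with a Sobel response on a per-pixel world-position buffer are all assumptions for illustration.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude of a 2D array using 3x3 Sobel kernels (edge-padded)."""
    p = np.pad(img.astype(float), 1, mode="edge")
    # Horizontal gradient: right column minus left column, weighted 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical gradient: bottom row minus top row, weighted 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)

def edge_mask(luminance, world_pos, lum_thresh=1.0, pos_thresh=1.0):
    """Hypothetical combination: mark a pixel as an edge if either the
    luminance image or the world-position buffer (H x W x 3) responds."""
    lum_edges = sobel_magnitude(luminance) > lum_thresh
    # Sum the Sobel response over the x, y, z position channels.
    pos_response = sum(sobel_magnitude(world_pos[..., c]) for c in range(3))
    return lum_edges | (pos_response > pos_thresh)
```

The position term is there to catch depth discontinuities that don't show up as a brightness change. You could just as well multiply or weight the two responses instead of OR-ing them, which is the "room for variation" I mean.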
Anyway, though, this paper (the one in the post) is a little bit of a different problem, and I just really like it because there's something poetic about synthesizing "hand drawn" lines from input points based on user study information + a model of arm movement for a seated human. It's a neat example of the subset of NPR research that is concerned with learning about the human artistic process through attempting to simulate it.