When is the cost of virtual functions high enough that you should avoid them?
This is not an easy question.
I would say when you can avoid it without making the code more complex.
One example would be if you have an array of pointers to a large number of objects (how many? Not sure, I would say 100k+) of different types, mixed together rather than sorted by type, and you need to iterate over them often.
The problem here is that successive calls keep landing in different implementations of the virtual function, so their instructions are less likely to be in the CPU cache when they're needed, resulting in more cache misses. However, if you sort the objects by type in your container, you could improve the situation.
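To make the sorting idea concrete, here is a rough sketch. The `Shape`, `Square` and `Rect` classes and the helper names (`group_by_type`, `total_area`, `type_runs`) are made up for illustration and are not from any particular codebase. It assumes the iteration order does not matter to your program, since grouping reorders the container.

```cpp
#include <algorithm>
#include <cassert>   // only needed for the checks in the usage note
#include <cstddef>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <vector>

// Hypothetical class hierarchy, used only for illustration.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

using Shapes = std::vector<std::unique_ptr<Shape>>;

// Reorder the container so that objects of the same dynamic type sit
// next to each other. The relative order of the types is
// implementation-defined; only the grouping matters here.
void group_by_type(Shapes& shapes) {
    std::stable_sort(shapes.begin(), shapes.end(),
        [](const std::unique_ptr<Shape>& a, const std::unique_ptr<Shape>& b) {
            const Shape& ra = *a;
            const Shape& rb = *b;
            return std::type_index(typeid(ra)) < std::type_index(typeid(rb));
        });
}

// The hot loop: one virtual call per element.
double total_area(const Shapes& shapes) {
    double sum = 0.0;
    for (const auto& s : shapes) {
        sum += s->area();
    }
    return sum;
}

// Count runs of consecutive elements that share a dynamic type.
// Fewer runs means fewer switches between different overrides.
std::size_t type_runs(const Shapes& shapes) {
    std::size_t runs = 0;
    for (std::size_t i = 0; i < shapes.size(); ++i) {
        if (i == 0) { ++runs; continue; }
        const Shape& cur = *shapes[i];
        const Shape& prev = *shapes[i - 1];
        if (typeid(cur) != typeid(prev)) ++runs;
    }
    return runs;
}
```

You would call `group_by_type(shapes)` once after filling the container (or whenever it changes a lot), and then run `total_area(shapes)` as often as needed. Something like `assert(type_runs(shapes) == 2)` after grouping a container of squares and rectangles confirms that the objects really are grouped. Whether this pays off depends on the number of objects and how big the overrides are, so measure before and after.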