I use Java with a simple task queue and multiple worker threads (Scrapy is single-threaded, although it uses async I/O).
Failed tasks are collected into a second queue and restarted when needed.
I used Jsoup [1] for parsing, and proxychains and HAProxy + Tor [2] for distributing requests across multiple IPs.
Doesn't ThreadPoolExecutor take care of all of that if you store the returned Future from the submit method? Then you just have the main thread wait for those.
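For what it's worth, here is a rough sketch of how the Future-based approach could look, using only java.util.concurrent. The class name CrawlPoolSketch, the Fetcher interface, and the maxRounds parameter are made up for illustration and are not from either poster's code. One thing it shows: Future.get() reports a failed task by throwing ExecutionException, but resubmitting it is still up to the caller, so some form of "second queue" (here the failed list) remains.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CrawlPoolSketch {

    /** Hypothetical stand-in for the real fetch-and-parse step (e.g. Jsoup). */
    interface Fetcher {
        String fetch(String url) throws Exception;
    }

    /**
     * Submits every URL to a fixed-size pool, waits on the Futures, and
     * resubmits the URLs whose tasks threw, for at most maxRounds rounds.
     * Returns the results of the tasks that eventually succeeded.
     */
    static Map<String, String> crawl(List<String> urls, Fetcher fetcher,
                                     int threads, int maxRounds)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Map<String, String> results = new LinkedHashMap<>();
        List<String> pending = new ArrayList<>(urls);
        try {
            for (int round = 0; round < maxRounds && !pending.isEmpty(); round++) {
                // Submit everything first so the tasks run concurrently.
                Map<String, Future<String>> futures = new LinkedHashMap<>();
                for (String url : pending) {
                    Callable<String> task = () -> fetcher.fetch(url);
                    futures.put(url, pool.submit(task));
                }
                // The calling thread blocks on each Future in turn.
                List<String> failed = new ArrayList<>();
                for (Map.Entry<String, Future<String>> entry : futures.entrySet()) {
                    try {
                        results.put(entry.getKey(), entry.getValue().get());
                    } catch (ExecutionException e) {
                        // The task threw; keep the URL for the next round.
                        failed.add(entry.getKey());
                    }
                }
                pending = failed;
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

Calling CrawlPoolSketch.crawl(urls, url -> download(url), 8, 3), where download is whatever fetch function you already have, would run up to 8 fetches at a time and retry failures for up to 3 rounds. URLs that still fail after the last round are simply missing from the returned map.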
[1] https://jsoup.org/
[2] https://github.com/mattes/rotating-proxy