> Better solution: pay target-site.com to start building an API for you.
I'd add to this:
Do you really need continuing access? Or just their data occasionally?
Pay them to just get a db dump in some format. For large amounts of data, creating an API then having people scan and run through it is just a massive pain. Having someone paginate through 200M records regularly is a recipe for pain, on both sides.
A supported API might take a significant amount of time to develop, and has on-going support requirements, extra machines, etc. Then you have to have all your infrastructure or long running processes to hammer it and get the data as fast as you can, with network errors and all other kinds of intermittent problems to handle.
A pg_dump > s3 dump might take an engineer an afternoon and will take minutes to download, requiring getting approval from a significantly lower level and having a much easier to estimate cost of providing.
I'd add to this:
Do you really need continuing access? Or just their data occasionally?
Pay them to just get a db dump in some format. For large amounts of data, creating an API then having people scan and run through it is just a massive pain. Having someone paginate through 200M records regularly is a recipe for pain, on both sides.
A supported API might take a significant amount of time to develop, and has on-going support requirements, extra machines, etc. Then you have to have all your infrastructure or long running processes to hammer it and get the data as fast as you can, with network errors and all other kinds of intermittent problems to handle.
A pg_dump > s3 dump might take an engineer an afternoon and will take minutes to download, requiring getting approval from a significantly lower level and having a much easier to estimate cost of providing.