They are probably just checking headers such as user agent and cookies. Would co...

throwaway81523 · on May 25, 2023

I will try that, but a quick look at the error page makes me think it tries to run a JavaScript blob.

ksala_ · on May 25, 2023

They're just checking the user agent:

    $ curl -s -I 'https://www.sfgate.com/' -H 'User-Agent: curl/7.54.1' | head -1
    HTTP/2 403
    
    $ curl -s -I 'https://www.sfgate.com/' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0' | head -1
    HTTP/2 200

One "trick" is that Firefox (and I assume Chrome?) lets you copy a request as a curl command. You can then check whether that command works in the terminal, and if it does, binary search for the required headers.
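
For what it's worth, here is a rough sketch of automating that header search in bash. It is not a true binary search: it drops one header at a time and keeps the drop whenever the request still succeeds, which is slower but simpler to get right when more than one header turns out to be required. The function names `check_headers` and `minimize_headers` are made up for illustration, and "works" is assumed to mean an HTTP 200 on a HEAD request, as in the curl commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from any tool).

# Probe: succeed if a HEAD request to $1 with the given headers returns 200.
# Assumption: "the request works" means the status code is exactly 200.
check_headers() {
    local url=$1; shift
    local args=() h status
    for h in "$@"; do
        args+=(-H "$h")
    done
    status=$(curl -s -I -o /dev/null -w '%{http_code}' "${args[@]}" "$url")
    [ "$status" = "200" ]
}

# Print, one per line, a subset of the given headers that still passes
# check_headers. Headers are removed one at a time (greedy elimination).
minimize_headers() {
    local url=$1; shift
    local kept=("$@")
    local trial=() i=0 j
    while [ "$i" -lt "${#kept[@]}" ]; do
        # Build the candidate list without header number i.
        trial=()
        for j in "${!kept[@]}"; do
            if [ "$j" -ne "$i" ]; then
                trial+=("${kept[$j]}")
            fi
        done
        if check_headers "$url" "${trial[@]}"; then
            # Still works without it: drop it. Index i now points at the next header.
            kept=("${trial[@]}")
        else
            # Request broke: this header is required, move on.
            i=$((i + 1))
        fi
    done
    if [ "${#kept[@]}" -gt 0 ]; then
        printf '%s\n' "${kept[@]}"
    fi
}

# Example call (does network I/O, so it is left commented out):
# minimize_headers 'https://www.sfgate.com/' \
#     'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0' \
#     'Accept: */*'
```

Paste in the headers from the browser's "copy as curl" output as arguments. It makes at most one request per header, so be polite if the list is long.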

chrisco255 · on May 25, 2023

It probably does. But there are better modern tools like headless Chrome / Puppeteer that can fully render a page with scripts.