I approached the traffic filtering from a number of directions. I log the incoming traffic and parse it in the background to look for IPs and ranges that are generating excessive traffic, and I get notified of any issues. From there I can block them temporarily or permanently if desired. I also have a script that can temporarily block all Tor exit nodes, as I occasionally get attempted spamming/scraping from there.
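As a rough illustration of the log-parsing idea above (a minimal sketch, not my actual tooling), something like the following counts requests per IP and per /24 range and flags anything over a threshold. The function name `find_heavy_hitters`, the threshold value, the /24 grouping, and the assumption that each log line starts with the client IP (as in the common access log format) are all hypothetical choices for the example.

```python
from collections import Counter
import ipaddress


def find_heavy_hitters(log_lines, threshold=1000):
    """Return (ips, ranges) whose request counts exceed `threshold`.

    Assumes each log line begins with the client IP address, as in the
    common access log format. Lines that do not start with a valid IP
    are skipped.
    """
    ip_counts = Counter()
    range_counts = Counter()

    for line in log_lines:
        fields = line.split(maxsplit=1)
        if not fields:
            continue  # blank line
        try:
            ip = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address

        ip_counts[str(ip)] += 1

        # Group IPv4 addresses into /24 ranges (an arbitrary choice here)
        # so that traffic spread across neighbouring addresses still shows up.
        if ip.version == 4:
            net = ipaddress.ip_network(f"{ip}/24", strict=False)
            range_counts[str(net)] += 1

    heavy_ips = {ip: n for ip, n in ip_counts.items() if n > threshold}
    heavy_ranges = {r: n for r, n in range_counts.items() if n > threshold}
    return heavy_ips, heavy_ranges
```

In practice this would run periodically over the most recent chunk of the log, with the flagged IPs and ranges fed into a notification and then, after review, into whatever firewall or web server block list is in use.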
For automating scraping/spamming protection, I would consider something like CloudFlare Security. My own tools are good enough for now, but I would still be vulnerable to a DDoS or a botnet scrape. They're enough to stop one-man scrapers, but I'll need something beefier soon.
Ad formats that seem to work best: 728x90 by a long margin. I generally top and tail the content section with a leaderboard ad. I do serve different sizes for mobile, etc., but the leaderboard works way better. I've also experimented with skyscrapers and boxes, which performed very poorly (skyscrapers especially).
Can you talk about this?
Also, how many ads and what formats/sizes work best for you?