I always ask there if I can't find what I'm looking for.
Here are more data sets. These are general ones; email me if you have a specific kind of data set in mind (e.g. web-as-corpus, spam, images, social, reviews). I have a big file of information.
Good instructions:
http://corpus.leeds.ac.uk/internet.html#description
http://sslmit.unibo.it/~baroni/bootcat.html
http://www.drni.de/wac-tk/index.php/Documentation
Web as corpus (email me if you need more):
http://cleaneval.sigwac.org.uk/
http://liste.sslmit.unibo.it/pipermail/sigwac/2007-November/...
http://wacky.sslmit.unibo.it/doku.php?id=
http://clic.cimec.unitn.it/marco/research.html