Testing is done by extracting the review text and the author's other reviews, which involves natural language processing, along with the reviewer profile variables. As mentioned, the main focus was to detect paid reviews that some companies purchase, and there are many of them, believe it or not.
The machine learning implementation stores validated "fake reviews" in the database as a profile to test new reviews against. Users can also vote on whether they think the calculated grade is fair, according to their interpretation of the listed reviews. Those votes are also used by the machine learning algo.
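To make the "profile to test against" idea concrete, here is a minimal sketch of comparing a new review with stored validated fakes. The post does not say how the comparison is done, so the Jaccard word overlap used here, and every function name, is an assumption for illustration only.

```python
import re

def tokens(text):
    """Return the set of lowercase words in a review."""
    return set(re.findall(r"[a-z']+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity between two token sets, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def max_similarity_to_known_fakes(review, known_fakes):
    """Highest similarity between a review and any stored validated fake."""
    review_tokens = tokens(review)
    return max(
        (jaccard(review_tokens, tokens(fake)) for fake in known_fakes),
        default=0.0,  # no stored fakes means nothing to match against
    )

# Hypothetical stored profile of validated fake reviews.
known_fakes = ["Best product ever, highly recommend, five stars!"]

reworded = "Best product ever, five stars, highly recommend!"
unrelated = "The zipper broke after two weeks"
print(max_similarity_to_known_fakes(reworded, known_fakes))
print(max_similarity_to_known_fakes(unrelated, known_fakes))
```

A reshuffled copy of a known fake scores high while an unrelated complaint scores zero. A real system would presumably combine a score like this with the reviewer profile variables rather than rely on word overlap alone.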
I saw something similar being claimed by Yelp in one of their promotional videos. https://www.youtube.com/watch?v=jbkQtW8A408 And after watching it I had the same doubts that I have now:
Aren't most paid reviews done by agencies which either 1) have an army of employees who have old profiles with several reviews and are trained to give a balanced review? Why is their writing style likely to be any different from real reviewers? There is a high probability of false negatives. This might just weed out the bad freelancers who put no thought into writing such reviews.
Or 2) pay real reviewers who are influencers to give them favourable reviews. And these you cannot and should not remove in any case.
Also, what checks do you have against false positives?
Yes! I have seen the stuff Yelp is doing. Reviews are definitely integral to Yelp, so they must eliminate those fabricated reviews to maintain their reputation; otherwise no one would be using their website.
1) You are correct, a large number of the fake reviews are actually the work of agencies that offer to manufacture high-star reviews to increase a product's ranking. Now, these people use bots (or an army of trained monkeys) for most of their review entries, and Amazon has detected them in the past. However, in the past year or so I've noticed a lot of fake reviews still lingering on Amazon, and that is a shame because they often mislead consumers. This is why I created the site, as I have fallen victim to them too. Just take a look at the health supplements section of Amazon and look at the amount of dishonest marketing some companies are doing.
For case 2), at the present moment Fakespot would be unable to detect a scenario where a reputable reviewer was paid by a company, because that would indeed be a "real" review written by a real person. However, a large chunk of the analysis algorithm does look at the language, and if the review text is extremely positive and not very detailed or relevant to the product, it will raise a flag.
3) For false positives, the machine learning implementation records votes and modifies itself to eliminate them.
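As a rough illustration of points 2) and 3), the sketch below flags short, overwhelmingly positive reviews and loosens the flagging threshold when users mostly vote that a grade was unfair. This is not Fakespot's actual algorithm: the word list, the thresholds, and the function names are all hypothetical stand-ins for whatever the real model does.

```python
import re

# Hypothetical word list; a real system would use a much richer language model.
POSITIVE_WORDS = {"amazing", "best", "great", "love", "perfect", "awesome", "excellent"}

def words_of(text):
    """Return the lowercase words in a review, in order."""
    return re.findall(r"[a-z']+", text.lower())

def positivity(text):
    """Fraction of the review's words that appear in POSITIVE_WORDS."""
    words = words_of(text)
    if not words:
        return 0.0
    return sum(w in POSITIVE_WORDS for w in words) / len(words)

def is_flagged(text, threshold=0.3, min_words=20):
    """Flag reviews that are both short (little detail) and very positive."""
    return len(words_of(text)) < min_words and positivity(text) >= threshold

def adjust_threshold(threshold, unfair_votes, fair_votes, step=0.05):
    """Raise the threshold (flag fewer reviews) when most votes say 'unfair'.

    This is an assumed, very simple stand-in for "records votes and
    modifies itself"; the threshold is left unchanged otherwise.
    """
    if unfair_votes > fair_votes:
        return min(1.0, threshold + step)
    return threshold

gushing = "Amazing! Best product ever, love it!"
detailed = "The strap is sturdy but the buckle cracked after a month of daily use"
print(is_flagged(gushing), is_flagged(detailed))
print(adjust_threshold(0.3, unfair_votes=8, fair_votes=2))
```

The gushing one-liner is flagged and the detailed complaint is not. Note that this kind of heuristic is exactly what case 1) worries about: a trained writer producing a balanced, detailed review would pass it.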
This is a cool idea for Amazon but would be really really useful for TripAdvisor. That thing is so blatantly riddled with some people gaming it. I myself have reported people for being ridiculously obvious about it.
The algorithm analyzes language and many other variables, and it improves over time thanks to the machine learning implementation.
The primary aim is to distinguish fake reviews (i.e., reviews that were paid for in order to inflate a product's ranking) from legit reviews.