In August 2007 we launched some new applications for small ads, covering jobs abroad, housing rentals and property, and standard classifieds. Usage has grown well and there are now over 50,000 ads posted across the different areas. We are really happy with the progress, but this growing success has created a new problem – masses of bad posts. This was not a complete surprise, as our previous ads system was also affected. What changed was the volume.
Bad posts will normally fall into one of the following categories (if you want to find out more about this, check out our article Stay safe online):
- Fraud: current favourites of the spammers include imaginary flats for rent, puppies/parrots/etc for ‘free’ adoption and bargain-priced luxury cars.
- Spam: standard stuff like pron, pills and multi-level marketing schemes.
- SEO rubbish: people putting up ads of little relevance to our users (normally in the wrong country, category, etc).
All of these things are detrimental to the user experience we provide. Before we launched, we did a fair amount of work to automatically identify bad posts. To a point this worked, even if the flagged posts still needed some manual checking. That is fine for the kinds of post we can detect automatically. The big problem is that the fraudsters are not stupid: they have upped their game and are now adapting texts and stealing pictures from real posts. As these ads are submitted manually and their content looks the same as what real users write, they are hard to identify automatically. We do have many users reporting fraud when they see it, but we see this as a second line of defence – ideally people should not see bad posts in the first place.
When we combined geolocation data with the fraudulent activity, some obvious trends emerged, and this is where the [non politically-correct] name ‘The Nigerian Problem’ comes from: 98% of the activity originating from Nigeria was identified as fraudulent. Nigeria is not alone, as Benin, Côte d’Ivoire and Ghana also feature highly. There is also lots of bad activity originating in Europe (strangely Sweden seems to be a hotspot) and the US, but there it makes up a smaller percentage of overall activity.
So this correlation is useful, but how can we use it to help prevent fraud on websites? One internal suggestion [even less politically-correct] was to just block posts from West Africa. Obviously this is the equivalent of sweeping the dust under the sofa – it doesn’t really solve anything, and it creates the new problem of excluding some of the poorest and most disadvantaged people on the planet. No, I don’t think Nigerians are ‘bad’ people – the wealth disparities and endemic corruption in what should not be a poor country are some of the root causes of this problem. This is not a ‘Nigerian Problem’, it is an Internet Problem. Everyone involved in Internet services needs to get better at working out how to block bad people. This is a hard problem.
Our conclusion was that we will have to factor geolocation into the process we use for sifting out bad posts, where it will serve as one indicator among others. We also realised we are going to need to get a lot better at doing this in general if the amount of activity keeps increasing at the current rate of 5-10%/week.
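To make the idea of geolocation as an indicator rather than a block more concrete, here is a minimal Python sketch. It is not our actual system: the names (`Post`, `risk_score`, `triage`), the weights, the thresholds and the default fraud rate are all assumptions made up for illustration. The only number taken from our data is the 98% figure quoted above.

```python
from dataclasses import dataclass

# Illustrative weights and thresholds only; these are assumptions, not tuned values.
GEO_WEIGHT = 0.4
CONTENT_WEIGHT = 0.6
REVIEW_THRESHOLD = 0.5
REJECT_THRESHOLD = 0.9

# Fraction of past activity from each country that was judged fraudulent.
# "NG" uses the 98% figure from the post; the fallback is a made-up placeholder.
COUNTRY_FRAUD_RATE = {"NG": 0.98}
DEFAULT_FRAUD_RATE = 0.05


@dataclass
class Post:
    country_code: str     # assumed to come from an IP geolocation lookup (not shown)
    content_score: float  # 0.0-1.0, assumed output of the existing automated checks


def risk_score(post: Post) -> float:
    """Blend the geolocation signal with the content signal into one score."""
    geo = COUNTRY_FRAUD_RATE.get(post.country_code, DEFAULT_FRAUD_RATE)
    return GEO_WEIGHT * geo + CONTENT_WEIGHT * post.content_score


def triage(post: Post) -> str:
    """Decide whether a post is published, sent to manual review, or rejected."""
    score = risk_score(post)
    if score >= REJECT_THRESHOLD:
        return "reject"
    if score >= REVIEW_THRESHOLD:
        return "review"
    return "publish"
```

With these example numbers the geolocation term can contribute at most 0.4, which is below the review threshold, so a post is never held back on location alone. A post from a high-risk location simply needs less suspicious content before it is sent for a manual look.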