Tuesday 18 August 2009

On programming, wikis and protecting against vandalbots

This post is a continuation/expansion of this post where I describe the creation of a proof-of-concept vandalbot. So go read the first post... shoo! Done? Good, on to how to protect a wiki from automated nasties.

You may think, 'why not just block them on sight?' That's all well and good for sites like Wikipedia, where hundreds of thousands of users are monitoring recent edits, so any undesirable ones are quickly reverted and the offending user blocked. However, for smaller wikis, the vandalism may go unnoticed for several hours or even days, making the following preventative measures necessary.

The most effective thing you can do is to install the AbuseFilter extension, and then set up rules to throttle edits (only allow X edits in Y minutes) from new/unregistered users. This is very effective, and prevents vandalbots from editing wildly, thus giving admins a chance to see the vandalism and block the bot before much damage has been done. Rules can be programmed to trigger on just about anything, and carry out a wide range of actions when tripped.

This is not easy for those inexperienced with MediaWiki, nor is it possible for wikis hosted on external servers (such as Wikia). If you can do it, though, it is the best way to limit vandal/spam bot activity.
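To make the 'only allow X edits in Y minutes' idea concrete, here is a minimal sketch of a sliding-window throttle in C#. It is an illustration only: it is not AbuseFilter's rule syntax or MediaWiki's code, and the class and method names (EditThrottle, TryRecordEdit) are made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of an "X edits in Y minutes" throttle.
// This is NOT AbuseFilter's implementation; all names are invented.
public class EditThrottle
{
    private readonly int maxEdits;
    private readonly TimeSpan window;

    // Timestamps of recently accepted edits, keyed by user name.
    private readonly Dictionary<string, Queue<DateTime>> recent =
        new Dictionary<string, Queue<DateTime>>();

    public EditThrottle(int maxEdits, TimeSpan window)
    {
        this.maxEdits = maxEdits;
        this.window = window;
    }

    // Returns true if the edit is allowed, false if the user is over the limit.
    public bool TryRecordEdit(string user, DateTime now)
    {
        Queue<DateTime> times;
        if (!recent.TryGetValue(user, out times))
        {
            times = new Queue<DateTime>();
            recent[user] = times;
        }

        // Forget edits that have fallen out of the time window.
        while (times.Count > 0 && now - times.Peek() >= window)
        {
            times.Dequeue();
        }

        if (times.Count >= maxEdits)
        {
            return false; // over the limit: reject and do not record
        }

        times.Enqueue(now);
        return true;
    }
}

public static class ThrottleDemo
{
    public static void Main()
    {
        // Allow 3 edits per minute; the "bot" tries one edit every 4 seconds (15epm).
        EditThrottle throttle = new EditThrottle(3, TimeSpan.FromMinutes(1));
        DateTime start = new DateTime(2009, 8, 18, 12, 0, 0);
        for (int i = 0; i < 5; i++)
        {
            bool allowed = throttle.TryRecordEdit("Dfghsj01", start.AddSeconds(i * 4));
            Console.WriteLine("Edit {0}: {1}", i + 1, allowed ? "allowed" : "throttled");
        }
    }
}
```

With these made-up settings, the first three edits go through and the next two are refused. A real throttle would normally apply only to new or unregistered users, as described above.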

If your wiki is quite small, or aimed at a niche community, you could edit LocalSettings.php (assuming you have access to the file; I know Wikia wikis need to have such changes approved) to stop anonymous (unregistered) users from editing. You could even restrict new account creation to admins, thus requiring prospective new users to request an account. This will put off casual vandals, and make creating even a small set of vandalbot accounts difficult (if you suddenly get 30 requests in a day when you usually get 4, something is wrong).

OK, what if you are a wiki admin with no access whatsoever to the low-level settings of your wiki? What do you do to help protect it? Well, you could refer the site administrator to this post (^_^), or do the following:
  • Watch out for the mass-creation of user accounts, especially with nonsense names or incremental names (Dfghsj01, dfghsj02, etc.), and block them if at all suspicious.
  • If a bot does strike and cause mass havoc, fight fire with fire and use a bot (or a bot process running on your account) to undo the damage. I have created an antivandal utility which you can download here (source included (C#), dotnetwikibot included). It allows you to auto-revert a set number of edits by a certain user. It's very user-friendly and relatively fast.
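The first tip (watching for incremental account names) can be partly automated. The following is a rough sketch, not a finished tool: it assumes you already have a list of recently created user names from somewhere, and the names AccountPatterns and FindSuspiciousStems are hypothetical. It strips trailing digits, ignores case, and reports any name 'stem' shared by several accounts.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for spotting runs of names like Dfghsj01, dfghsj02, ...
public static class AccountPatterns
{
    // Returns the lower-cased stems shared by at least minAccounts names
    // that end in a numeric suffix.
    public static List<string> FindSuspiciousStems(IEnumerable<string> names, int minAccounts)
    {
        Dictionary<string, int> counts = new Dictionary<string, int>();
        foreach (string name in names)
        {
            // Strip a trailing run of digits: "Dfghsj01" -> "Dfghsj".
            string stripped = Regex.Replace(name, @"\d+$", "");
            if (stripped.Length == name.Length || stripped.Length == 0)
            {
                continue; // no numeric suffix, or the name is nothing but digits
            }

            string stem = stripped.ToLowerInvariant();
            int seen;
            counts.TryGetValue(stem, out seen); // seen stays 0 if the stem is new
            counts[stem] = seen + 1;
        }

        List<string> suspicious = new List<string>();
        foreach (KeyValuePair<string, int> entry in counts)
        {
            if (entry.Value >= minAccounts)
            {
                suspicious.Add(entry.Key);
            }
        }
        return suspicious;
    }
}

public static class AccountPatternsDemo
{
    public static void Main()
    {
        string[] newAccounts = { "Dfghsj01", "dfghsj02", "Dfghsj03", "Alice", "Bob7" };
        foreach (string stem in AccountPatterns.FindSuspiciousStems(newAccounts, 3))
        {
            Console.WriteLine("Possible bot accounts with stem: " + stem);
        }
    }
}
```

A match is only a hint, not proof: plenty of genuine users have numbers at the end of their names, so check contributions before blocking anyone.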
In all probability, your wiki will never come under fire from a malicious bot - especially if you implement the preventative measures - but if it does, at least you now know what to do!

UPDATE: If you're interested to see just what malicious bots can do, have a look at this. It's the contribs list of a test bot operating on my recently-created test wiki. The bot created 50 pages in just under 2 minutes, and then 'vandalised' said pages at about the same rate (25epm). I also tried testing the pagemove routine, but Wikia obviously has a throttle to prevent mass-pagemoves.

Monday 17 August 2009

On programming, wikis and a proof-of-concept vandalbot

Disclaimer: I am not to be held responsible for anything you may do with the code samples contained in this post. My 'vandal' bot is a proof-of-concept only, and I am giving an insight into its creation as an intellectual/programming/wiki-management exercise only.

Skip to the bit marked 'Programming ahead' if you already know/don't care about Wikipedia and the background to all this.

Before we get into the business of programming bots of wiki destruction, a little background. I used to be quite an avid Wikipedia editor/bot coder, up until early this year when the immense levels of bureaucracy and drama became too much to cope with. For those who are unaccustomed to Wikipedia, and therefore do not have the faintest idea what I am on about (few people do anyway), here's an analogy. Think of a group of teenage girls crossed with a government department, and that's a pretty good idea of what we're talking about. Don't get me wrong, WP is an amazing resource, but surely you can write an encyclopedia without having to have a week-long process for even the simplest thing?

Anyway, back to the main event. Wikipedia - and all wikis for that matter - have a persistent problem with vandalism. Usually this consists of puerile comments added into articles (usually high-profile ones), page blanking or link spam. However, there is a small group of extremely determined vandals (especially on high-profile, high-traffic sites) who edit vast numbers of pages in a very short time, often replacing their content with shock-site images or mindless crap designed to crash browsers (huge images, malformed HTML, etc). Another breed moves (renames) pages en masse to names such as ON WHEELS, or variations on HAGGER???? (the favourites of vandals known as 'Willy on Wheels' and 'Grawp' respectively).

One day about a month ago when I came across some of this 'vandalism-on-steroids', I wondered: how do they do it? And thus the vandalbot experiment began. I figured that they had to be using some kind of editing bot to do the job, based on how many edits per minute they were making. 30-40 epm is far too high a rate for even the most determined idiot. So, using my (then rather rusty) knowledge of bots, I set out to reverse-engineer (in a very loose sense) a method for mass-vandalism.

Warning: Programming ahead!

For this project, I used Visual C# Express 2008 and the dotnetwikibot API, both of which are free and easy to use.

I'm not going to go into the intricacies of using VC#, or programming using dotnetwikibot. There are plenty of tutorials and examples out there, and I am not writing a 'how to make a awsum vandalbot to pwn wikipedia!!!!111' for all the script kiddies out there. This is about finding out how they can be used to the detriment of wikis, and how to protect against similar bots. To those with experience with wikibots, this will seem pathetically simple, and it is. However, for wiki admins without such experience, how to protect against vandalbots is likely to be a complete unknown.

My first thought was that at its simplest, a vandalbot must take a list of pages, and edit them without human interaction, much like any editing bot, except this one will hinder, not help the wiki. With that in mind, I wrote my 'draft' vandalbot code.

Site site = new Site("http://en.wikipedia.org/wiki/", "botname", "botpass"); //Define the site to edit, and the bot's username and password
PageList pl = new PageList(site);
pl.FillFromAllPages("a", 0, true, 100); //Get the first 100 pages starting with 'a'
foreach (Page page in pl) //Iterate through the pages in the list of 100
{
    page.Save("Cheese!", "Eek!"); //Save each page replacing all content with 'Cheese!', with the edit summary 'Eek!'
}

This (when wrapped in an actual program) worked, and my tests (on a test wiki, 'experienced mad programmer on a closed wiki') showed that the bot could reach 12-14 epm quite easily. With refinement, the code could probably be made to reach 20+ epm.

Something more destructive than simply editing pages is creating new ones. Admin tools are required to delete pages, and this will lengthen cleanup time and increase disruption. This requires the pl.fill... line to be changed to
pl.FillFromFile(@"P:\ath\to\a\text\file");

The file should contain page titles, one on each line. The program will create each page in turn, with the same text and edit summary as before.

The final form of mass-vandalism, page moves, can also be executed by this program, by changing the page.save... line to
page.RenameTo((page.title + " a suffix"), "Move summary");

I tested all of these bots on the test wiki, and let me assure you, in the hands of even a programming-dunce vandal, they could cause mass damage to wikis. What's even more worrying is the prospect of several vandals all launching an attack at once. The editing routine could obliterate a small wiki within a couple of hours, and a co-ordinated attack could do the same to even a larger site:
Small scale:
15epm with 10 vandals starting at different offsets = 150epm
150epm on a 10,000 page wiki (e.g. medium sized Wikia): ~67 minutes to vandalise every page, assuming no reversions.

Large scale:
15epm with 500 vandals (not unrealistic, think 4chan) = 7500epm(!)
7500epm on a 1,000,000 page wiki (largish Wikia, Wikipedia): ~133 minutes to vandalise every page, assuming no reversions.
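
As a rough check on these figures, the time to hit every page is just the page count divided by the combined edit rate. Here $N$ is the number of pages, $v$ the number of vandals and $r$ the edits per minute each one manages (symbols introduced purely for illustration). The estimate assumes no reversions, no throttling, and no two bots editing the same page:

$$T = \frac{N}{v \times r}$$

$$T_{small} = \frac{10{,}000}{10 \times 15} \approx 67 \text{ minutes}, \qquad T_{large} = \frac{1{,}000{,}000}{500 \times 15} \approx 133 \text{ minutes}$$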

No small mess for wiki admins to clear up, especially without a bot or any reversion scripts.

EDIT: The section about protecting against vandalbots has been split off into its own post here. After further research, I felt that it deserved more in-depth coverage.

Whew... long post. Hopefully this has given you an insight into wiki (vandal) bots, and how to protect a wiki from the robot hordes! It was certainly an interesting experiment for me (I ended up adding a GUI, options to select which mode the thing operates in, textboxes for the page text and summary, and a log window, because I'm like that). Just keep in mind, I don't condone wiki vandalism, and I'm not responsible for anything that you may do after reading this article, e.g. creating an army of vandalbots, getting blocked from every wiki from here to Timbuktu, falling into a singularity, etc.

Monday 1 June 2009

On zombies

In a small departure from the usual rational, scientific content here...

53%
... which isn't that bad. Seeing as I don't own a gun (illegal in Britain), this cuts my survival chances quite a bit (although I am good with improvising weapons, killing zombies with a frying pan with a nail attached to it, anyone?).

Seeing as when (when, not if) the dead rise we all need to be prepared, here are my top three tips for surviving Z-Day, based on common zombie fallacies (read: the following happen in all cheesy zombie flicks).
  • DON'T under any circumstances go looking for loved ones or friends. It may seem harsh, but they may already be zombified, and it will distract you from your primary goal. Survival.
  • DON'T set zombies on fire, unless they are dead (as in, really dead), or you have contained them (i.e. trap them in a building, then blow it up). The only thing worse than a zombie trying to eat your brains, is a zombie trying to eat your brains while aflame. Fire will not kill zombies quickly, as they need only about 1% of normal human bodily functions. However, if you are disposing of zombie corpses (as you should to prevent infection) fire is a good method of doing so.
  • Finally, if you are in a group DON'T, for the love of brains, split up. Single prey is much easier to surround and kill than a large group (why d'you think wildebeest and the like travel in hundred-strong packs?). You can watch each other's backs and get a better all-round view of things if you stick together, too.
Good luck, stay informed, stay alive.

Tuesday 26 May 2009

On love, logic and neurotransmitters

Is it just me, or does love seem to set even usually right-thinking people on the fast track to stupidville? People seem to have the idea that love is something ethereal, something that only poets truly understand. The cultural fallout from this notion is everywhere ('true love', 'love at first sight', etc.).

I must have heard the words 'but it's love, you can't explain it with logic and science' (love is all just hormones and neurotransmitters, people) a million times, and given the requisite explanation a million more. And don't get me started on the girl who told me, after she overheard me talking to someone else, who shares my views on this (they do exist!), 'science takes all the emotion out of everything, it shouldn't be allowed to mess with love', and then, when I tried to explain, 'I don't care about science, it doesn't matter'. When scientists save your life one day, you may think differently.

Can we please, as a society, move on from putting love on a pedestal as this untouchable, pure, poetic thing? It's a remnant of the evolutionary pair-bonding imperative, which exists to maximise the continuation resilience of the species. Nothing more, nothing less. Repeat after me: love is all just hormones and neurotransmitters...

Sunday 17 May 2009

On angels, demons and antimatter

Unless you've been living on another planet for the past month or so (and I wouldn't blame you if you had!), you have probably heard about the film adaptation of Dan Brown's Angels and Demons. If you like your films, you've probably also heard from most critics that it is rubbish. Well, in my opinion, you heard wrong. Very wrong indeed.
(Minor spoiler warning, although most of the things I'll discuss you probably already know)

Although the script does deviate from the book at the beginning quite substantially (i.e. Langdon never goes to CERN in the film, and Max Kohler is absent), for those who haven't read the book, the film's sequence of events does set things out more logically. The middle and end follow the book fairly closely, and the ending sequence with the antimatter 'bomb' was just as spectacular as I thought (hoped?) it would be.

I feel that a lot of critics have become too used to seeing either a) films with no intellectual content at all, and are thus surprised to see something that at least tries to educate its viewers; or b) films with so much emotion and high-class cinematography that they've forgotten what a good mystery film is.

Now, Angels was never going to be 100% scientifically accurate, I mean if you took all the antimatter ever made in any particle accelerator, it would barely add up to a billionth of a gram, let alone the amount seen in the film (or mentioned in the book, which I have indeed read). Secondly, we don't know exactly what antimatter looks like, but I highly doubt that it looks like blue plasma! However, the ability of an antimatter weapon to level a city is most certainly true, although not in the way most people think. The reason that it is so powerful, and so efficient is that 100% of the mass of the particles involved is converted into energy. The energy density (amount of energy per kilogram) of antimatter is about 4 orders of magnitude (10,000x) greater than that of conventional nuclear fuel.

Taking the amount of antimatter in the container shown in the film to be about 10 grams, on total annihilation with the same amount of matter it would release about $1.8\times10^{15}$ joules of energy; this is about 430 kilotons of TNT. For comparison, the 'Little Boy' bomb dropped on Hiroshima in 1945 was about 15 kilotons, and that virtually obliterated the city centre. However, this calculation assumes that all the mass is converted to gamma rays, which then superheat the surrounding matter extremely fast, leading to the devastating blast. The most likely candidate particle(s) for an antimatter bomb would not lead to this.

When antiprotons and antineutrons annihilate with regular protons and neutrons, around 60% of the energy produced is taken away by neutrinos, which do not interact with matter in any appreciable way, and therefore the energy is 'lost' (at least in terms of explosive yield). Taking this into account we get: $(1.8\times10^{15}J)\times0.4 = 7.2\times10^{14}J$. This is about 172 kt of TNT, still a big explosion, but not the earth-shattering blast most people would expect.
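
For anyone who wants to check the arithmetic, here is the calculation written out. It is only a sketch under stated assumptions: $c \approx 3.00\times10^{8}$ m/s, the usual convention of $4.184\times10^{12}$ J per kiloton of TNT, and the 10 gram and 40% figures used in the text.

$$E = mc^2 = (0.010 + 0.010)\,\text{kg} \times (3.00\times10^{8}\,\text{m/s})^2 = 1.8\times10^{15}\,\text{J}$$

$$\frac{1.8\times10^{15}\,\text{J}}{4.184\times10^{12}\,\text{J/kt}} \approx 430\,\text{kt}$$

$$E_{blast} = 0.4 \times 1.8\times10^{15}\,\text{J} = 7.2\times10^{14}\,\text{J}, \qquad \frac{7.2\times10^{14}\,\text{J}}{4.184\times10^{12}\,\text{J/kt}} \approx 172\,\text{kt}$$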

In conclusion, Angels and Demons is a fantastic film, especially if (like me) you're a fan of historical or semi-historical mysteries. The scientific inaccuracies don't really bother me all that much, it was never designed to be a particle physics film. Although I do think they should have given a bit more explanation as to exactly what antimatter does, and why it is so explosive.

Friday 8 May 2009

Why would Paris Hilton be safe in a zombie apocalypse?

Answer: They want braiiiinnnns.

And she certainly doesn't have one. London's 'Metro' newspaper has a report detailing how, through legal documents filed as part of a court case, it has come to light just how air-headed the socialite really is.

Apparently she 'gets a new cell phone, like, every two weeks' and has never seen a phone bill in her life. She was a producer for the 2006 flop 'Pledge This!' (and it is this that the whole court case is about; according to the plaintiffs, she failed to promote the film effectively), but when asked about this role, she didn't even know what a producer does! 'Help get cool people in the cast', apparently.

My main issue isn't really with morons like Hilton, if they want to stay stupid and aren't interested in the world around them, fine. However, what really annoys me is that they (either actively or passively) encourage young people to follow in their footsteps! 'Oh, I don't care about science or other subjects because I'll just become a socialite instead of getting a job!'. This is mainly girls, although footballers do exactly the same to teenage boys' brains.

I'm sure that when a new medical procedure or chemical (developed by scientists; those who do care) saves their lives or those of their family, they'll be singing a different tune.

Friday 1 May 2009

Swine flu: fact or pigswill...

Disclaimer: I am not a medical professional, nor do I pretend to be one, and this post is not intended as medical advice. If you show any symptoms of swine flu, be sure to follow the procedures for your country (in most cases this means calling your GP).

The world (well, most of it) is currently embroiled in swine flu madness. The virus has now been reported in 14 countries (4 May 09 update: 20 countries now), is showing an ability to spread amongst humans, and Mexico has "begun a five-day shutdown of... non-essential government services and businesses" in an attempt to control it.

As is to be expected, amongst all the excellent information and advice out there, there is a lot of nonsense and rubbish (I'm looking at you, Daily Sun).
Here are the top three swine flu myths, and the corresponding scientific truths:

Myth: Swine flu is a killer virus, and could kill millions of people (like the 1918 Spanish flu)
Reality: Although ~160 people have died in Mexico of swine flu, there have been no deaths in more developed countries (the US death was a Mexican child that had crossed the border). This is mostly due to better healthcare and early treatment. We didn't have anything like Tamiflu (oseltamivir) or Relenza (zanamivir) in 1918, nor did we know as much about transmission methods. It is likely that many people will get sick, but unlikely that many will die, unless they have compromised immune systems (e.g. the elderly, people who are HIV-positive or undergoing intensive cancer treatment).

Myth: You can get the virus from eating pork from infected pigs.
Reality: Err... no, there is no evidence that the flu virus can be passed to humans from pig products as long as the meat is heated to above 70°C (and pork is usually cooked at around 200°C). This fear about 'infected pork' is why the WHO refuses to call it 'swine flu' any more, they now use the term 'Influenza A(H1N1)', the 'scientific' name for this strain as it were.

Myth: We should all start wearing face masks to avoid getting the flu.
Reality: Not really. Although the UK Department of Health has ordered crateloads of face masks for doctors and other health workers, these are special masks with especially small holes to trap the tiny virus particles. The regular blue face masks that people have been buying are next to useless, especially when they get moist from your breath. Covering your mouth when coughing and sneezing (preferably with a disposable tissue) is a better way of minimizing the spread of the virus. Face masks also give a false sense of security, and may lead to people not practicing proper hygiene.

Hopefully this has helped clear up some of the 'woo' and sensationalism surrounding swine flu, and remember, ask for the science before accepting anything. Extraordinary claims require extraordinary evidence.

Tuesday 28 April 2009

'Oh, come on, you just _have_ to go'

I have been hearing those words quite frequently of late. Why? I will explain. I'm coming towards the end of my secondary school education (11-16, I'm not sure what the US equivalent is), and at the end of the final year, there is a 'prom', or in other words a semi-formal buffet dance affair which almost everyone goes to. Being a rational, scientifically minded person does not make school life easy, as ~80% of teenagers treat you as a different species. I'm generally well-liked by people, but I'm no socialite. By now you should be getting an idea of what I'm driving at!

In essence, I can think of many things that I would rather do than pay to go along to something I have next to no interest in (read, watch Star Trek/Doctor Who, surf the web, work on my time machine, etc.). People at school seem intent on persuading me to go, using arguments such as 'it's the last time you'll see everyone' (not true, there's a results day, and a trip for high achievers) and 'it's tradition that everyone goes' (it was tradition for about 1000 years to chop people's heads off and make a public spectacle of it...). To me the whole thing seems rather pointless anyway, and after about half an hour I bet most of the people will wish they were somewhere else! It seems to be the product of clique culture & traditionalism, things which I am decidedly allergic to! I wonder how many people have thought critically about whether they really want to go, rather than just thinking 'well everyone else/my boyfriend/my girlfriend/next door's cat is going, so I have to'.

I feel better for having that little rant!

Monday 27 April 2009

A glimmer of hope...

Television is blamed for an awful lot these days: street violence, bad manners, bad eyesight, bad brains, etc. In some ways this is true: in accordance with Sturgeon's Law, 90% of TV programming is utter crap (Big Brother, [somewhere]'s Got Talent, soaps, any celebrity gossip shows).

However, TV can also be a force for good. There are (at least in the UK, I'm not sure about elsewhere) some truly brilliant science-based, informative shows. Last week I watched one of these, and finally thought 'we're making progress'. In the BBC's 'Professor Regan's Medicine Cabinet', the eponymous Imperial College professor puts various medical myths, from health checks to branded drugs, to the test. The show explains the placebo effect and other important scientific concepts in an easy-to-understand manner, and Regan certainly seems to have her head screwed on air-tight when it comes to scientific rigour (even sending off various homeopathy papers for a thorough statistical analysis).

For someone like me who always follows the science, and doesn't swallow anything (both literally and metaphorically) without seeing the proof, it felt good that proper scientific procedure had come to prime time TV. If at least one person watched the show and stopped using 'complementary' therapies, it did some good.

For those of you in the UK, you can watch the show (and the previous one about diets) on the BBC iPlayer. Those elsewhere with a bit of technical savvy could always google for 'uk proxy' and go from there.

Greetings from Planet Earth!

Ohai! Welcome, wherever or whenever you may be (if you are a time traveller, do remember to visit MIT's Time Travel Convention. I know it's already happened, but then again, you do have a time machine!). As you may have already picked up, I'm a bit... different from your average Homo sapiens. I love science (specifically physics), and am proud to call myself a geek and a skeptic. For this reason, I am increasingly disturbed by the amount of woo-woo and very bad science floating around today. Here I will output my thoughts and opinions on various things, from skepticism and science to computers, and anything else in between which I find interesting!