WebScan Boten-Anna

Flowerchucker by Banksy

WebScan Boten-Anna searches the Internet for email addresses. Within an hour, the Java program typically collects around 20k unique email addresses.

Previously, the webpage of the graffiti artist Banksy featured an extract from the diary of Lieutenant Colonel Mervin Willett Gonin DSO, who was among the British soldiers to liberate Bergen-Belsen in 1945. I used the list of email addresses to distribute this text to people all over the world.

After all, this project is of a peaceful nature. The message is non-commercial. I use the emails to promote nothing but the extract from the diary. I will do my best to send at most one message to any email address.

In the following, I report on how WebScan Boten-Anna functions. The Java program consists of three classes:

An hour after launch (the rates refer to an interval of 2 seconds)

It has been my experience
that folks who have no vices
have very few virtues.
Abraham Lincoln

The NodeAnalysis thread parses a single webpage that is encoded in HTML, the conventional Hypertext Markup Language. The routine will

There are connect and read timeouts of several seconds. The program reads and interprets the page line by line. Any line is chopped beyond 2kB. Parsing aborts after 2k lines or after a total of 50kB. Web resources with the extension .php are intentionally not classified as links: I suspect that requesting them can circumvent the read timeout.
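The limits above can be sketched as follows. This is a minimal illustration, not the code from WebScan.zip: the class name `BoundedPageReader`, its methods, and the exact constants are assumptions, and sizes are counted in characters rather than bytes. In Java, connect and read timeouts are usually set with `URLConnection.setConnectTimeout` and `URLConnection.setReadTimeout`; whether the program does it this way is also an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: the class name, method names, and constants are
// illustrative assumptions, not taken from the WebScan sources.
class BoundedPageReader {
    static final int MAX_LINE_CHARS = 2 * 1024;   // chop any line beyond about 2kB
    static final int MAX_LINES = 2000;            // abort after 2k lines
    static final int MAX_TOTAL_CHARS = 50 * 1024; // abort after about 50kB in total

    // Reads lines until the input ends or one of the limits is reached.
    // Sizes are counted in characters, which only approximates bytes.
    static List<String> readBounded(BufferedReader in) throws IOException {
        List<String> lines = new ArrayList<>();
        int total = 0;
        String line;
        while (lines.size() < MAX_LINES
                && total < MAX_TOTAL_CHARS
                && (line = in.readLine()) != null) {
            total += line.length(); // count what was read, before chopping
            if (line.length() > MAX_LINE_CHARS) {
                line = line.substring(0, MAX_LINE_CHARS);
            }
            lines.add(line);
        }
        return lines;
    }

    // Mirrors the rule that resources ending in .php are not treated as links.
    static boolean isAcceptedLink(String url) {
        return !url.toLowerCase().endsWith(".php");
    }

    public static void main(String[] args) throws IOException {
        String page = "<html>\n<a href=\"http://example.org/a.html\">a</a>\n</html>\n";
        List<String> lines = readBounded(new BufferedReader(new StringReader(page)));
        System.out.println(lines.size() + " lines kept");
        System.out.println(isAcceptedLink("http://example.org/index.php"));
    }
}
```

Note that `readLine` still buffers a whole line before it is chopped; a stricter variant would read character by character.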

BotenAnna is the central thread that creates the instances of NodeAnalysis. Sometimes, up to 70 webpages are parsed in parallel. Whenever a NodeAnalysis thread detects a link, the link is reported to BotenAnna. BotenAnna decides if the link should be analysed in the near future. If so, the link is appended to a list. BotenAnna creates instances of NodeAnalysis based on this list. Initially, the list contains several seed addresses. For instance, one could start with


by Banksy
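The bookkeeping of the central thread can be pictured with the sketch below. It only models the list of pending links and the count of running analyses. The class `Dispatcher`, its method names, the example seed addresses, and the treatment of 70 as a hard cap are all assumptions made for illustration. In the actual program, each link handed out would be passed to a new NodeAnalysis thread.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch of the dispatch bookkeeping; all names are illustrative.
class Dispatcher {
    static final int MAX_PARALLEL = 70; // assumed cap on pages parsed in parallel

    private final LinkedList<String> pending = new LinkedList<>();
    private int running = 0;

    // The list initially contains the seed addresses.
    Dispatcher(List<String> seeds) {
        pending.addAll(seeds);
    }

    // Called by a worker whenever it detects a link.
    synchronized void reportLink(String url) {
        pending.addLast(url);
    }

    // Hands out the next link, or null if the list is empty
    // or the parallel limit has been reached.
    synchronized String startNext() {
        if (running >= MAX_PARALLEL || pending.isEmpty()) {
            return null;
        }
        running++;
        return pending.removeFirst();
    }

    // Called by a worker when it has finished its page.
    synchronized void finished() {
        if (running > 0) {
            running--;
        }
    }

    synchronized int runningCount() {
        return running;
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        Dispatcher d = new Dispatcher(
                Arrays.asList("http://example.org/", "http://example.com/"));
        String first = d.startNext();
        d.reportLink("http://example.net/");
        d.finished();
        System.out.println(first + " done, " + d.pendingCount() + " links pending");
    }
}
```

Guarding every method with `synchronized` keeps the list and the counter consistent when many worker threads report links at the same time.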

BotenAnna continuously keeps track of the most recently analysed links and detected emails. This memory prevents repeated visits to the same webpage. The synchronized data structures employed by BotenAnna are

Potentially, the program screens the Internet for hours. Stability is a main criterion. Therefore, all employed data structures are bounded in size. The Hashtable is used to count the web addresses of each domain. If the number exceeds a certain threshold, no more web addresses of this domain are added to the LinkedList. Instead, webpages of other domains are favoured. This mechanism results in a search in breadth rather than in depth.
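A minimal sketch of this mechanism is given below. It combines a per-domain counter in a Hashtable, a bounded LinkedList of pending links, and a bounded memory of recently seen links. The class `BoundedFrontier`, its method names, the use of a LinkedHashMap for the memory, and all limits are assumptions for illustration. The actual thresholds and structures of the program are not stated here.

```java
import java.net.URI;
import java.util.Hashtable;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.Map;

// Hypothetical sketch; class name, method names, and limits are assumptions.
class BoundedFrontier {
    private final int maxPerDomain;
    private final int maxPending;
    private final Hashtable<String, Integer> perDomain = new Hashtable<>();
    private final LinkedList<String> pending = new LinkedList<>();
    private final LinkedHashMap<String, Boolean> remembered;

    BoundedFrontier(int maxPerDomain, final int memorySize, int maxPending) {
        this.maxPerDomain = maxPerDomain;
        this.maxPending = maxPending;
        // Insertion-ordered map that forgets its oldest entry once it is full.
        this.remembered = new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > memorySize;
            }
        };
    }

    // Returns true if the link was queued for analysis.
    synchronized boolean offer(String url) {
        String host = hostOf(url);
        if (host == null) return false;                 // not a usable address
        if (remembered.containsKey(url)) return false;  // seen recently
        if (pending.size() >= maxPending) return false; // list is full
        int count = perDomain.getOrDefault(host, 0);
        if (count >= maxPerDomain) return false;        // domain quota used up
        perDomain.put(host, count + 1);
        remembered.put(url, Boolean.TRUE);
        pending.addLast(url);
        return true;
    }

    // Returns the next link to analyse, or null if none is pending.
    synchronized String poll() {
        return pending.pollFirst();
    }

    // Extracts the host name, or returns null if the address cannot be parsed.
    static String hostOf(String url) {
        try {
            return URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        BoundedFrontier f = new BoundedFrontier(2, 1000, 500);
        System.out.println(f.offer("http://a.example/1"));
        System.out.println(f.offer("http://a.example/2"));
        System.out.println(f.offer("http://a.example/3"));
        System.out.println(f.offer("http://b.example/1"));
    }
}
```

Because a domain stops contributing links once its quota is reached, the pending list fills up with addresses from many different domains, which gives the breadth-first character described above. In this sketch the Hashtable itself still grows with the number of distinct domains, so a fully bounded version would also need a cap on its size.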

Making the source code publicly available is debatable. However, I would like to distribute the code for educational purposes. After all, the program can do other things than collect emails. Note that the program does not send emails!

WebScan Boten-Anna * WebScan.zip 65 kB
* To customize the program, please modify the seeds in seeds.txt, as well as the procedure of NodeAnalysis. The name "Boten-Anna" is borrowed from a song by the Swedish DJ Basshunter.

Useful references for this project are:

What is to be feared is not so much
the sight of the immorality of the great
as that of immorality leading to greatness.
Alexis-Charles-Henri de Tocqueville

Impulse response

In a first test run, I send the text to 15k unique recipients from all over the world. The emails are sent via SMTP, the Simple Mail Transfer Protocol, from one of my accounts

Philipp Hakenberg <philipp()hakenberg.org>

In any case, this project promotes the excerpt from the diary and not my homepage. This solution is only temporary. I will switch to another non-anonymous from-address soon.

by Banksy

Sending 15k emails, each 4kB in size, via this single account takes 6 hours. Meanwhile, I receive about 200 out-of-office or retirement notices. If an email cannot be delivered, the report of failure is at least 4kB in size. However, any incoming message greater than 1kB is deleted automatically from the inbox. Unfortunately, this way I lose track of the number of broken email addresses.

In the following 48 hours, about 100 people visit hakenberg.org. Using geolocation by IP address, I determine that most of the visitors come from countries with English as a national language. This is so even though quite a number of emails were sent to German addresses. To better suit people from nations all over the world, I asked friends of mine to translate the text into their native languages:

An interesting auto-reply reads as follows

Sorry, your message was not sent out to 'netops-centrex'
because the message is identical to a message you sent to 'info-access' recently.
If you believe you are receiving this message in error, 
resend your message with slightly different content.

Maybe I can overcome this intelligence by appending a unique line like "email number 24183" to every message. Previously, I sent the emails in the same order in which WebScan Boten-Anna had collected them. This way, a single mail server might receive 50 messages from my email address in a very short time. In the future, I will iterate over the Hashtable in Java, whose order depends on hash values rather than on the order of collection, to mix up the order of the emails.

For a moment, I was struck by the thought that some of the recipients with Gmail accounts would mark my message as spam, and that consequently Google would downgrade hakenberg.org. However, a search for hakenberg yields hakenberg.org on page two. In any case, I am willing to pay this price.

I have received no complaints. Instead, there was one encouraging personal reply

Hi Philipp,
Thank you for sharing the excerpt from Gonin's diary.  Indeed, this gesture
did restore hope and dignity, though initially one might not have expected
it to do so.
Best wishes,

Money does not perform. People do.
Dexia group

All around the world

I would like to send emails to people from as many nations in as many languages as possible. The number of emails sent to each country ranges from 7k to 40k.

Depending on the country, I choose a different name and subject for the email. For instance, the header to Mexico is

De: Jose Sanchez Rojas <pepe()hakenberg.org>
Asunto: Parte del diario del coronel Gonin

Nations supplied

Be sure you put your feet in the right place,
then stand firm.
Abraham Lincoln

Please respond to me via (do not reply to the email you have received).

Below, I reproduce the personal reviews in the order of their receipt. Empty lines and contact information are removed. Unfortunately, I lost most of the replies that were addressed to any hakenberg.org mail address. If your message appears here but you would rather have it removed, a short note to me will do the job.

Hi Jan,
What a fascinating project - I just read all about it on your web page
after receiving one of your emails. Kind regards from Melbourne, Australia.
What is this crap supposed to be?
Is your email-address search program catching up with you by now, or what?
Thanks for your email. One does wonder, though, what you are trying to tell people with it.
Dear Jan Philipp Hakenberg,
Interesting that I should receive this from you.  I have recently  
been reading about Germany during and immediately following the war.   
I recently read W. G. Sebald's lectures on the air raids, Hans Erich  
Nossack's The End (in the recent English translation with photographs  
by Erich Andres), and the just-published New Yorker article by Günter  
Grass about his time in the war.  The Banksy manifesto is very  
disturbing.  I completely disagree with the judgement of the diarist  
and your email correspondent.  This does not tell us about the  
recovery of human individuality, but about the mindless process of  
war, the madness of humans so violently displaced from their everyday  
lives, the ineffectuality of bureaucracy, in short about many things  
except for what your diarist and correspondent tell us.
Brad [...]
I have read what you wrote on your site about the email, and you should not 
be so certain that you have right on your side.  The manifesto seems to be 
using these war time experiences as a justification for graffiti art in 
peacetime.  This is controversial and may not be what the original author 
would have wanted.  The manifesto is not really about promotion of peace, 
is it?  It is an artist promoting his/her own agenda.
Incidentally, I notice that you have your email address in image format on 
your website, presumably to avoid bulk email, but you are happy to send 
your own bulk email to email addresses found in plaintext format on other 
websites.  Is that not a bit hypocritical?
Life, any breath of it, exalts it...
Dear Jan Philipp Hakenberg,
I consider the message you sent me as spam. I understand from your 
website that the project is intended to send an email to each address 
only once, but I still think this is an unethical and intrusive 
procedure. I consider Banksy to be an opportunist passing off commercial 
art as political commentary, and thus degrading both art and politics. 
His use of the diary extract is sensationalist and shows a lack of 
consideration for the just treatment of the victims of the Holocaust. As 
the author of a book on Adorno, I feel Banksy could learn from the 
suggestion that after Auschwitz one should not write poetry.
Please do not reply to my message, as I do not wish to enter into a 
discussion with you about this.
Thank you very much for the text you sent us.
We are at your service.
Kind regards,
Master in Psychology [...]
Director of the Centro Nacional de Diagnóstico
Para las Enfermedades Emocionales.
Well, thanks for sending me this; it is very interesting, but I have to say
Lic. Gareth [...]
Thanks, I had read it in English...
My stepfather was a combat cameraman, and he went in with the
troops to liberate several camps...
What a story! Perfect for lifting one's spirits. Especially when one
is overwhelmed with problems and finds out that there is someone in worse
Thank you very much. Whenever texts from that dark period fall into my hands, I
read them with great dedication.
Teresita [...]
the book is quite interesting, do you by any chance have the rest???
thanks for writing
Good morning, you sent me an interesting article from the diary of 
Colonel Gonin.  I failed to place it please can you tell me more about
yourself and the context of this article.  I think it is very
interesting and I could not help my mind ticking around it.
I look forward to hear from you.  Thanks
Vimbai [...]
Please inform me why you have sent this to me at my place of 
employment, without any introduction or explanation. I read about the 
liberation of Bergen-Belsen while I was quite a small child and in the 
course of my lifetime have come to know concentration camp survivors 
in a number of countries, so the excerpt you have sent to me out of 
the blue is hardly new information. 
J. [...]
I think you got the wrong email address.... so that later you do not wait for the reply that will not
Better luck next time
Ana María Avilés Muñoz
What a pity for you and for all those who suffered and died, oh well.
I found the story very interesting and moving.
Do I know you?
Interesting stuff - I've come across this before in Wall and Piece too!
Thank you, my dear Pepe, for the email.
Warm regards from me.
Thank you, Dr. Sanchez, you have no idea how grateful I am for your words.
Good afternoon,
We very much appreciate your information.
Dear José:
    I do not think we know each other personally, but this week I received
a harrowing excerpt from a diary of Colonel Gonim, with you as the
sender. Could you please give me a more complete reference so that I
can obtain this diary? Many thanks in advance. Marcelino [...]
May I know the reason for this mailing?
On the other hand, I thank you for it. Harsh as it is, it is still unsettling and
invites reflection.
Thank you for your email - I am presuming that you intended to submit
this extract to the Time Capsule.
At present the Time Capsule website is under redevelopment and will be
re-launched at the end of the summer.
Kind regards
indeed...from the website...some have suggested that banksy wrote it himself.
back from germany now, will see you soon!
noel and [...]
thanks for the interesting extract form the diary of Lieutenant Colonel 
Mervin Willet Gonin DSO who was among The first British soldiers to 
liberate Bergen-Belsen in 1945.
You have o send this extract to Schwabenakademie Irsee in German-
language if you want to participate in "10. Irseer Pegasus". You can 
read the conditions of participation by uschtrin.de 
I hope that the information can help you.
Yours sincerly
Hello Philipp!
I just checked our email address for the first time in ages
and came across your email. I liked the report
very much, and I am a big fan of Banksy anyway.
I just have not quite understood what the connection to the
[...] student council is?? Do we know each other? Or what is
this email all about? Shed some light on this...
Regards, Marina
Excuse me, your email leaves me somewhat puzzled.
S. [...]

Unfortunately, I lost some of the replies that were addressed to any hakenberg.org mail address.

One week after launch, my provider in Germany sent me a kind request to stop. I am not being ironic: they tolerated my activities for quite a large number of emails.