[Listwork] mailman full raw archive mboxes are NOT anti-spammed :(

boud boud1 at wp.pl
Wed Feb 18 13:43:37 PST 2004


hi listwork,

SPAM PROBLEM:
HYPOTHESIS
SOLUTION 2.1.4

HACK SOLUTION 2.0.11 (AND PROBABLY 2.1.1)
(1) remove link to *.mbox in /var/lib/mailman/Mailman/Archiver/HyperArch.py
(2) remove compiled versions HyperArch.py[oc]
(3) test on an archive (e.g. imc-pl-tech ;)
(4) make the *.mbox/*.mbox files inaccessible (since google links remain)
(5) think about newlist for new lists (maybe nothing needs doing)


SPAM PROBLEM:
------------
  piotr from imc-pl-tech noticed a spammability bug in mailman.

  It looks like the archive files linked under "download the full raw archive",
such as the following, are NOT antispammed:

http:// lists indymedia org/mailman/public/listwork.mbox/listwork.mbox

http:// lists indymedia org/pipermail/imc-pl-tech.mbox/imc-pl-tech.mbox

http:// lists indymedia org/pipermail/imc-pl.mbox/imc-pl.mbox

[i've put spaces " " in the above URLs to avoid giving google an extra link 
to these extremely rich spammable email resources...]

They have addresses in the  user@machine.domain
format instead of  user at machine.domain.
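
To make the difference between the two formats concrete, here is a minimal sketch of the kind of "@" -> " at " rewriting the html archives show. It is NOT mailman's own code: the regular expression and the function name obscure_addresses are assumptions made up for illustration, and the pattern is only a rough match for addresses, not a full parser.

```python
import re

# Hypothetical pattern: a rough match for user@host.domain.
# This is an illustration, not the expression mailman itself uses.
ADDRESS_RE = re.compile(
    r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)"
)


def obscure_addresses(text):
    """Rewrite every user@host.domain as 'user at host.domain'."""
    return ADDRESS_RE.sub(r"\1 at \2", text)


sample = "From: someone@example.org\nCc: other.person@lists.example.net"
print(obscure_addresses(sample))
```

The raw .mbox files are served without any step like this, which is why they are so easy to harvest.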


This seems to me to be a bug in mailman. 

It seems to be present in both 2.0.11 and 2.1.1.


HYPOTHESIS
----------
IMHO the reason why this is probably not easy to solve is that the .mbox
is where mail is automatically saved when it's received. If this were 
filtered by "@" -> " at " then overall there would typically be 4
copies of the entire mailbox (e.g. html version, monthly archives, 
true mailbox with @ hidden from external access, and " at " version 
for web access).

i couldn't find whether this has been discussed, but it looks like there's a 
simple solution in 2.1.4.


SOLUTION 2.1.4
--------------
It looks like the solution in mailman-2.1.4 is to offer different templates:

mailman-2.1.4/templates/en/archtoc.html
mailman-2.1.4/templates/en/archtocentry.html
mailman-2.1.4/templates/en/archtocnombox.html   ->  this one has no mbox

e.g.
http://mail.python.org/pipermail/mailman-announce/  has no *.mbox/*.mbox
In fact, 
http://mail.python.org/pipermail/mailman-announce.mbox/
exists but nothing in it is accessible.

HACK SOLUTION 2.0.11 (AND PROBABLY 2.1.1)
----------------------------------------

In 2.0.11, the line pointing to the .mbox is in 

/var/lib/mailman/Mailman/Archiver/HyperArch.py  (for Debian anyway ;)

      You can get <a href="%(listinfo)s">more information about this list</a>
      or you can <a href="%(fullarch)s">download the full raw archive</a>
      (%(size)s).
     </p>

solution:

(1) remove the link to the full .mbox  in
/var/lib/mailman/Mailman/Archiver/HyperArch.py

To do this, 

replace

      You can get <a href="%(listinfo)s">more information about this list</a>
      or you can <a href="%(fullarch)s">download the full raw archive</a>
      (%(size)s).
     </p>

by

      You can get <a href="%(listinfo)s">more information about this list.</a>
     </p>
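
If you would rather not edit the file by hand, the replacement in step (1) can be sketched as a small script. This is only a sketch under assumptions: it assumes the text in HyperArch.py matches the lines quoted above exactly, whitespace included, and the function name remove_mbox_link is made up here. It is shown on an in-memory string; to apply it for real you would read the file, call the function, and write the result back after making a backup.

```python
# Text as quoted above; the exact whitespace in the real file is an assumption.
OLD = (
    '      You can get <a href="%(listinfo)s">more information about this list</a>\n'
    '      or you can <a href="%(fullarch)s">download the full raw archive</a>\n'
    '      (%(size)s).\n'
    '     </p>\n'
)
NEW = (
    '      You can get <a href="%(listinfo)s">more information about this list.</a>\n'
    '     </p>\n'
)


def remove_mbox_link(source):
    """Return source with the 'full raw archive' link removed.

    Raises ValueError instead of guessing if the expected text is absent.
    """
    if OLD not in source:
        raise ValueError("expected template text not found; edit by hand")
    return source.replace(OLD, NEW, 1)


# Demonstration on a stand-in string rather than the real file.
sample = "<p>\n" + OLD + "<!-- rest of template -->\n"
patched = remove_mbox_link(sample)
print(patched)
```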


(2) remove compiled versions (in my case the .pyc gets automatically recompiled)

rm /var/lib/mailman/Mailman/Archiver/HyperArch.py[co]

(3) test this

cd /var/lib/mailman/archives/public/
/usr/lib/mailman/bin/arch imc-pl-tech

Then check out:
http://lists.indymedia.org/pipermail/imc-pl-tech/

Hopefully there will be no link to the .mbox and even direct access will
be impossible.
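
Instead of eyeballing the regenerated page, the check for a leftover link can be sketched like this. It only inspects html text that you pass in (e.g. the saved index page of the archive); the class and function names are made up for illustration, and it assumes the link target ends in ".mbox" as in the URLs above.

```python
from html.parser import HTMLParser


class MboxLinkFinder(HTMLParser):
    """Collect the href of every <a> tag whose target ends in .mbox."""

    def __init__(self):
        super().__init__()
        self.mbox_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.endswith(".mbox"):
                self.mbox_links.append(value)


def find_mbox_links(html_text):
    """Return a list of .mbox link targets found in html_text."""
    finder = MboxLinkFinder()
    finder.feed(html_text)
    return finder.mbox_links


before = '<a href="../imc-pl-tech.mbox/imc-pl-tech.mbox">full raw archive</a>'
after = '<a href="../../listinfo/imc-pl-tech">more information</a>'
print(find_mbox_links(before))
print(find_mbox_links(after))
```

An empty list for the regenerated index page means the link is gone; it says nothing about whether the file itself is still reachable, which is what step (4) is for.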

(4) make the .mbox files inaccessible - since google links will still 
hang around for some time
chmod go-rw /var/lib/mailman/archives/private/*.mbox/*.mbox
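
For anyone who wants to see what that chmod does, or to run it from a script, here is a rough equivalent. It is a sketch, not part of mailman: the function name remove_group_other_rw is made up, and the demonstration runs on a throwaway temporary directory rather than on the real archive path.

```python
import glob
import os
import stat
import tempfile


def remove_group_other_rw(pattern):
    """Do the equivalent of `chmod go-rw` on every path matching pattern."""
    mask = stat.S_IRGRP | stat.S_IWGRP | stat.S_IROTH | stat.S_IWOTH
    changed = []
    for path in glob.glob(pattern):
        mode = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, mode & ~mask)  # clear group/other read and write bits
        changed.append(path)
    return changed


# Demonstration on a temporary demo.mbox/demo.mbox, not the real archive.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "demo.mbox"))
    demo_path = os.path.join(tmp, "demo.mbox", "demo.mbox")
    open(demo_path, "w").close()
    os.chmod(demo_path, 0o644)
    changed = remove_group_other_rw(os.path.join(tmp, "*.mbox", "*.mbox"))
    demo_mode = stat.S_IMODE(os.stat(demo_path).st_mode)

print(oct(demo_mode))
```

Note that the owner's permissions are left alone, so mailman itself needs to own the files (or otherwise keep access) to go on appending mail to them.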

(5) probably nothing needed when running   newlist  for new lists.

Making new lists will, by default, write *.mbox/*.mbox which are web
accessible, but nobody is going to link to them (unless paranoid,
deliberately wants to subject indymedia users to spam, ...)


anti-spam solidarity
boud (IMC PL)