ALT Linux sysadmins discussion
From: Eugene Prokopiev <prokopiev@stc.donpac.ru>
To: ALT Linux sysadmin discuss <sysadmins@lists.altlinux.org>
Subject: Re: [Sysadmins] dspam & postfix & mysql
Date: Tue, 23 Jan 2007 10:43:18 +0300
Message-ID: <45B5BC96.8010305@stc.donpac.ru>
In-Reply-To: <200701221741.31881.ashen@nsrz.ru>

Шенцев Алексей Владимирович wrote:
> Has anyone set up the combination in the subject line? Can you describe
> how you did it and what your setup looks like?

The right path is postfix -> dspam -> dbmail-lmtp, with every hop
speaking LMTP. In 2.0 this did not work, and I was too lazy to build a
test case and file a bug report (I now realize that was a mistake), so
I set up postfix -> dspam -> postfix -> dbmail-lmtp instead. The first
three components are described everywhere, probably including dspam's
own documentation, and dbmail-lmtpd is configured exactly as it would
be if dspam were not there at all.
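
To check the last hop of such a chain in isolation, a test message can
be handed straight to the LMTP daemon. The following is only a sketch in
Python 3 (the attached script is Python 2); the function names, the
addresses, and the host and port in the usage note are assumptions, not
values from this setup.

```python
import smtplib
from email.message import EmailMessage


def build_probe(sender, recipient):
    """Build a small plain-text test message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "LMTP delivery probe"
    msg.set_content("Test message for the final delivery hop.\n")
    return msg


def deliver_probe(host, port, sender, recipient):
    """Send the probe over LMTP and return the refused-recipient dict."""
    msg = build_probe(sender, recipient)
    # smtplib.LMTP speaks LHLO instead of EHLO; the rest matches SMTP.
    with smtplib.LMTP(host, port) as lmtp:
        return lmtp.send_message(msg, from_addr=sender, to_addrs=[recipient])
```

A call such as deliver_probe("127.0.0.1", 24, "probe@example.org",
"user@example.org") would attempt a delivery, assuming dbmail-lmtpd
listens on that address and port; an empty result means no recipient
was refused.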

You could also connect them through a pipe, but with dbmail that is not
the best idea ;)

Spam handling: users put everything they consider spam into the spam
folder, and a script (attached; it will have to be rewritten for 2.2
because of changes in the table structure) pulls the messages out of
that folder and trains dspam on them. The same processing should also
be done for non-spam to improve accuracy, but I have not gotten around
to it.
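
As a rough illustration of how the same training step could cover both
spam and non-spam, here is a sketch in Python 3 (the attached script is
Python 2). The flags are the ones the attached script passes to dspam,
except --class=innocent, which is DSPAM's class name for non-spam. The
names dspam_argv and train are hypothetical.

```python
import subprocess

DSPAM = "/usr/bin/dspam"  # same path as in the attached script


def dspam_argv(user, is_spam):
    """Build the dspam command line for retraining one message."""
    cls = "spam" if is_spam else "innocent"
    return [
        DSPAM,
        "--class=" + cls,
        "--source=error",  # the message was misclassified earlier
        "--user", user,
        "--mode=teft",
        "--feature=chained,noise",
    ]


def train(user, raw_message, is_spam):
    """Feed one raw message (bytes) to dspam on stdin; return its exit status."""
    result = subprocess.run(dspam_argv(user, is_spam), input=raw_message)
    return result.returncode
```

Passing the arguments as a list keeps the user name away from the
shell, so an alias containing quotes cannot alter the command.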

dspam in Sisyphus is neglected; even the package in backports for 2.4
looks slightly better. It would be good if someone with more interest
picked it up, you for example :)

By the way, how is building and packaging the web front end for dbmail
going?

-- 
Best regards, Eugene Prokopiev

[-- Attachment #2: dspam-learn.py --]
[-- Type: text/plain, Size: 4500 bytes --]

#!/usr/bin/python

# Copyright (C) 2006 Eugene Prokopiev <enp at altlinux dot org>
#
# This program is free software; you can redistribute it and/or 
# modify it under the terms of the GNU General Public License 
# as published by the Free Software Foundation; either 
# version 2 of the License, or (at your option) any later 
# version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import sys, getopt, email.Parser, re, subprocess

def export(connection):
	
	parser = email.Parser.Parser()
	# Splits "Name <address>" on the angle brackets so the address can be picked out.
	from_extractor = re.compile(r"[<>]")
	
	cursor_mailboxes = connection.cursor()
	cursor_messages = connection.cursor()
	cursor_messageblks = connection.cursor()
	cursor_modify = connection.cursor()
	
	# Every alias doubles as the DSPAM user name.
	sql_mailboxes = """select distinct alias from dbmail_aliases
		"""
	
	# The queries below need an extra column; add it once with:
	# alter table dbmail_messages add dspam_flag smallint not null default 0::smallint;
	
	# Not yet learned, not deleted messages in the 'Spam' folder of one alias.
	sql_messages = """select message_idnr, internal_date
		from dbmail_aliases
		inner join dbmail_users on dbmail_aliases.deliver_to = dbmail_users.user_idnr
		inner join dbmail_mailboxes on dbmail_users.user_idnr = dbmail_mailboxes.owner_idnr
		inner join dbmail_messages on dbmail_mailboxes.mailbox_idnr = dbmail_messages.mailbox_idnr
		inner join dbmail_physmessage on dbmail_messages.physmessage_id = dbmail_physmessage.id
		where dbmail_messages.dspam_flag=0 and dbmail_messages.status < 2 and dbmail_mailboxes.name = 'Spam'
		and dbmail_aliases.alias = %s order by message_idnr
		"""
	
	# Header and body blocks of one message, in storage order.
	sql_messageblks = """
		select messageblk, is_header from dbmail_messageblks
		inner join dbmail_messages on dbmail_messageblks.physmessage_id = dbmail_messages.physmessage_id
		where message_idnr = %s order by dbmail_messageblks.messageblk_idnr
		"""
	
	sql_modify = "update dbmail_messages set dspam_flag=1 where message_idnr = %s"
	
	processed = []
	cursor_mailboxes.execute(sql_mailboxes)
	for (alias,) in cursor_mailboxes.fetchall():
		print "Processing mailbox %s ..." % alias
		count = 0
		cursor_messages.execute(sql_messages, (alias,))
		for message_idnr, internal_date in cursor_messages.fetchall():
			count = count + 1
			# Argument list instead of a shell string (subprocess, Python 2.4+),
			# so an alias containing quotes cannot alter the command.
			dspam = subprocess.Popen(["/usr/bin/dspam", "--class=spam", "--source=error", "--user", alias, "--mode=teft", "--feature=chained,noise"], stdin=subprocess.PIPE)
			cursor_messageblks.execute(sql_messageblks, (message_idnr,))
			for messageblk, is_header in cursor_messageblks.fetchall():
				if (is_header == 1):
					# A missing From header yields None; fall back to "-".
					from_header = from_extractor.split(parser.parsestr(messageblk).get("From") or "-")
					if (len(from_header) == 1):
						from_header = from_header[0]
					elif (len(from_header) == 3):
						from_header = from_header[1]
					else:
						from_header = "-"
					print "Processing message from %s ..." % from_header
					# mbox "From " separator line with an explicit asctime-style date.
					dspam.stdin.write("From "+from_header+" "+internal_date.strftime("%a %b %d %H:%M:%S %Y")+"\n")
				dspam.stdin.write(messageblk+"\n")
			dspam.stdin.close()
			dspam.wait()
			processed.append(message_idnr)
		print "Processed mailbox %s with %s spam messages\n" % (alias, count)
	# Flag only the messages that were actually fed to dspam, so that mail
	# arriving in a Spam folder during the run is picked up next time.
	for message_idnr in processed:
		cursor_modify.execute(sql_modify, (message_idnr,))
		
def usage():
	
	usage_text = """

dspam-learn - DBMail Spam mailboxes -> mbox -> DSPAM

arguments:

	-h|--help		- show this text
	-t|--type		- database driver type
	-s|--server		- server where the DBMail database is installed
	-d|--database	- DBMail database name
	-l|--login		- login to database
	-p|--password	- password to database
	
"""
	
	print usage_text

def main(argv):
	
	type 		= "psycopg"
	server 		= "localhost"
	database	= "dbmail"
	login		= "dbmail"
	password 	= "dbmailpwd"
	
	try:
		opts, args = getopt.getopt(argv, "ht:s:d:l:p:", ["help", "type=", "server=", "database=", "login=", "password="])
	except getopt.GetoptError:
		usage()
		sys.exit(2)
		
	for opt, arg in opts:
		if opt in ("-h", "--help"):
			usage()
			sys.exit()
		elif opt in ("-t", "--type"):
			type = arg
		elif opt in ("-s", "--server"):
			server = arg
		elif opt in ("-d", "--database"):
			database = arg
		elif opt in ("-l", "--login"):
			login = arg
		elif opt in ("-p", "--password"):
			password = arg
			
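	# Load the DB-API driver named by -t/--type (psycopg by default)
	# without running the argument through exec.
	db = __import__(type)

	connection = db.connect("host="+server+" dbname="+database+" user="+login+" password="+password)
	connection.set_isolation_level(1)
	export(connection)
	connection.commit()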

if __name__ == "__main__":
    main(sys.argv[1:])
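
The script above turns each stored message into an mbox entry by
putting a "From " separator line in front of it. The sketch below shows
that single step in Python 3, using email.utils.parseaddr from the
standard library instead of the regular expression; it is an
alternative illustration, not the script's own method, and the function
names are hypothetical.

```python
from datetime import datetime
from email import message_from_string
from email.utils import parseaddr


def envelope_sender(header_block):
    """Return the address from the From header, or "-" if there is none."""
    msg = message_from_string(header_block)
    _name, addr = parseaddr(msg.get("From", ""))
    return addr or "-"


def mbox_from_line(header_block, internal_date):
    """Build the mbox separator line for one message."""
    stamp = internal_date.strftime("%a %b %d %H:%M:%S %Y")
    return "From %s %s\n" % (envelope_sender(header_block), stamp)


headers = "From: Eugene <enp@example.org>\nSubject: test\n\n"
print(mbox_from_line(headers, datetime(2007, 1, 23, 10, 43, 18)), end="")
```

parseaddr copes with display names and bare addresses alike, which
covers the cases the regular expression handles by counting the pieces
of the split.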
