ALT Linux Community general discussions
 help / color / mirror / Atom feed
* [Comm] Как обучить SpamAssassin?
@ 2006-01-19 16:49 unix9
  2006-01-20  7:59 ` Vladimir V. Kamarzin
  2006-01-20 17:25 ` Eugene Prokopiev
  0 siblings, 2 replies; 3+ messages in thread
From: unix9 @ 2006-01-19 16:49 UTC (permalink / raw)
  To: community

Приветствую всех!
Подскажите, как собственно вы настраиваете и потом обучаете SpamAssassin?


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Comm] Как обучить SpamAssassin?
  2006-01-19 16:49 [Comm] Как обучить SpamAssassin? unix9
@ 2006-01-20  7:59 ` Vladimir V. Kamarzin
  2006-01-20 17:25 ` Eugene Prokopiev
  1 sibling, 0 replies; 3+ messages in thread
From: Vladimir V. Kamarzin @ 2006-01-20  7:59 UTC (permalink / raw)
  To: unix9; +Cc: ALT Linux Community

>>>>> On 19 Jan 2006 at 21:49 "u" == unix9  writes:

 u> Подскажите, как собственно вы настраиваете и потом обучаете SpamAssassin?

man sa-learn

-- 
vvk



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Comm] Как обучить SpamAssassin?
  2006-01-19 16:49 [Comm] Как обучить SpamAssassin? unix9
  2006-01-20  7:59 ` Vladimir V. Kamarzin
@ 2006-01-20 17:25 ` Eugene Prokopiev
  1 sibling, 0 replies; 3+ messages in thread
From: Eugene Prokopiev @ 2006-01-20 17:25 UTC (permalink / raw)
  To: unix9, ALT Linux Community

[-- Attachment #1: Type: text/plain, Size: 1033 bytes --]

unix9 пишет:
> Приветствую всех!
> Подскажите, как собственно вы настраиваете и потом обучаете SpamAssassin?

А dspam попробовать не хотите? Преимущество перед SpamAssassin - это 
демон на С, интегрированный с ClamAV. Только что в бэкпорты отправилась 
(т.е. после ближайшей пересборки там появится) даже более свежая версия, 
нежели то, что есть в Сизифе - надеюсь, что со временем это исправится ;)

Способы обучения:

1) его собственный web-интерфейс (неопакеченный, но есть в архиве с 
исходниками)
2) утилита dspam. Например, чтобы сказать ей, что содержимое некоего 
mbox пользователь user@domain.com считает спамом, нужно скомандовать 
нечто вроде:

cat mbox | /usr/bin/dspam --class=spam --source=corpus  --user 
'user@domain.com' --mode=teft --feature=chained,noise

Я даже сваял простейший питоновский скрипт, который вытягивает 
содержимое папок с названием Spam у всех IMAP-пользователей DBMail и 
скармливает его таким образом dspam'у - см. аттач. Разумеется WITHOUT 
ANY WARRANTY :)

-- 
С уважением, Прокопьев Евгений

[-- Attachment #2: dspam-learn.py --]
[-- Type: text/plain, Size: 4478 bytes --]

#!/usr/bin/python

# Copyright (C) 2006 Eugene Prokopiev <enp at altlinux dot org>
#
# This program is free software; you can redistribute it and/or 
# modify it under the terms of the GNU General Public License 
# as published by the Free Software Foundation; either 
# version 2 of the License, or (at your option) any later 
# version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import sys, os, getopt, email.Parser, re

def export(connection):
	
	parser = email.Parser.Parser()
	from_extractor = re.compile(r"[<>]")
	
	cursor_mailboxes = connection.cursor()
	cursor_messages = connection.cursor()
	cursor_messageblks = connection.cursor()
	cursor_modify = connection.cursor()
	
	sql_mailboxes = """select distinct alias from dbmail_aliases
		"""
	
	# alter table dbmail_messages add dspam_flag smallint not null default 0::smallint;
	
	sql_messages = """select message_idnr, internal_date
		from dbmail_aliases
		inner join dbmail_users on dbmail_aliases.deliver_to = dbmail_users.user_idnr
		inner join dbmail_mailboxes on dbmail_users.user_idnr = dbmail_mailboxes.owner_idnr
		inner join dbmail_messages on dbmail_mailboxes.mailbox_idnr = dbmail_messages.mailbox_idnr
		inner join dbmail_physmessage on dbmail_messages.physmessage_id = dbmail_physmessage.id
		where dbmail_messages.deleted_flag=0 and dbmail_messages.dspam_flag=0 
		and dbmail_mailboxes.name = 'Spam' and alias = %s
		"""
	
	sql_messageblks = """
		select messageblk, is_header from dbmail_messageblks 
		inner join dbmail_messages on dbmail_messageblks.physmessage_id = dbmail_messages.physmessage_id 
		where message_idnr = %s order by dbmail_messageblks.messageblk_idnr
		"""
	
	cursor_mailboxes.execute(sql_mailboxes)
	for alias in cursor_mailboxes.fetchall():
		count = 0
		mbox = os.popen(("/usr/bin/dspam --class=spam --source=corpus --user '%s' --mode=teft --feature=chained,noise" % alias), "w")
		#mbox = open(("%s" % alias), "w")
		cursor_messages.execute(sql_messages, alias)
		for message_idnr, internal_date in cursor_messages.fetchall():		
			count = count + 1
			cursor_messageblks.execute(sql_messageblks, (message_idnr,))
			for messageblk, is_header in cursor_messageblks.fetchall():
				if (is_header == 1):
					from_header = from_extractor.split(parser.parsestr(messageblk).get("From"))
					if (len(from_header) == 1):
						from_header = from_header[0]
					elif (len(from_header) == 3):
						from_header = from_header[1]
					else:
						from_header = "-"
					mbox.write("From "+from_header+" "+internal_date.strftime())
				mbox.write(messageblk)
		mbox.close()
		print "mailbox : %s	\t - spam messages : %s" % (("%s" % alias), count)
	
	cursor_modify.execute("update dbmail_messages set dspam_flag=1 where dspam_flag=0")
		
def usage():
	
	usage_text = """

dspam-learn - DBMail Spam mailboxes -> mbox -> DSPAM

arguments:

	-h|--help		- show this text
	-t|--type		- database driver type
	-s|--server		- server where DBMail database installed
	-d|--database	- DBMail database name
	-l|--login		- login to database
	-p|--password	- password to database
	
before using this script you need to run something like:

alter table dbmail_messages add dspam_flag smallint not null default 0::smallint;
	
"""
	
	print usage_text

def main(argv):
	
	type 		= "psycopg"
	server 		= "localhost"
	database	= "dbmail"
	login		= "dbmail"
	password 	= "dbmailpwd"
	
	try:
		opts, args = getopt.getopt(argv, "ht:s:d:l:p:", ["help", "type=", "server=", "database=", "login=", "password="])
	except getopt.GetoptError:
		usage()
		sys.exit(2)
		
	for opt, arg in opts:
		if opt in ("-h", "--help"):
			usage()
			sys.exit()
		elif opt in ("-t", "--type"):
			type = arg
		elif opt in ("-s", "--server"):
			server = arg
		elif opt in ("-d", "--database"):
			database = arg
		elif opt in ("-l", "--login"):
			login = arg
		elif opt in ("-p", "--password"):
			password = arg
			
	exec "import "+type+" as db"

	connection = db.connect("host="+server+" dbname="+database+" user="+login+" password="+password)
	export(connection)
	connection.commit()

if __name__ == "__main__":
    main(sys.argv[1:])

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2006-01-20 17:25 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-19 16:49 [Comm] Как обучить SpamAssassin? unix9
2006-01-20  7:59 ` Vladimir V. Kamarzin
2006-01-20 17:25 ` Eugene Prokopiev

ALT Linux Community general discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror http://lore.altlinux.org/community/0 community/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 community community/ http://lore.altlinux.org/community \
		mandrake-russian@linuxteam.iplabs.ru community@lists.altlinux.org community@lists.altlinux.ru community@lists.altlinux.com
	public-inbox-index community

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://lore.altlinux.org/org.altlinux.lists.community


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git