For Jay: "Tangerine" email


K Post
Jay,
ASSP doesn't have a bias built into it against any particular word.

I believe your problem is that your Bayesian or HMM database is inaccurate,
and probably too immature to be used, if the appearance of a single word
causes a rejection. Or else the scoring and thresholds you've set need
adjusting.
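
To make the scoring point concrete, here is a rough Python sketch (not ASSP
code) of how additive message scoring behaves. The 20 for the BombRe hit and
the total of 21 come from the log quoted below; the threshold of 20, the
"reject at or above the threshold" rule, and the function names are my
assumptions for illustration only.

```python
# Illustrative only: this is not ASSP's implementation.
# The threshold and the ">=" comparison are assumptions.

def total_score(contributions):
    """Sum the individual score contributions for one message."""
    return sum(contributions.values())


def is_rejected(contributions, threshold):
    """Treat a message as spam once its total reaches the threshold."""
    return total_score(contributions) >= threshold


HYPOTHETICAL_THRESHOLD = 20  # assumed; substitute the value from your setup

# From the log: 20 was added for the BombRe hit and the total became 21,
# so 1 point came from somewhere else.
as_logged = {"other checks": 1, "BombRe 'tangerine'": 20}

# The same message with a -20 entry offsetting the BombRe hit.
with_offset = {**as_logged, "BombRe offset": -20}

print(total_score(as_logged), is_rejected(as_logged, HYPOTHETICAL_THRESHOLD))
print(total_score(with_offset), is_rejected(with_offset, HYPOTHETICAL_THRESHOLD))
```

With these assumed numbers the logged message totals 21 and is rejected,
while the offset version totals 1 and passes.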

A couple of things I would do:

1) Go through the mail log, find the incorrectly rejected messages, and
   a) look at the log to see why they were rejected, and
   b) copy them to the corrected not-spam corpus to train ASSP that it was
      mistaken.
2) Consider assigning a negative score to the word that's causing the
   problem in BombRe (a negative spam score counts in the message's favor,
   so the net result is positive). Even -20 should be enough to let the
   email slip through. That's a temporary fix until the corpus gets
   corrected.
   -or-
   Put this word in no processing for the time being.
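
For step 1, something along these lines can pull the rejected messages out of
the mail log so you don't have to read it by eye. This is a rough Python
sketch based only on the "[spam found]" line format in the excerpt you
posted; it assumes each log entry sits on one line in the real file, and the
names (Rejection, find_rejections) are mine, not anything from ASSP.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class Rejection(NamedTuple):
    timestamp: str   # e.g. "2016-10-20 21:39:09"
    session: str     # e.g. "m1-17548-06726"
    reason: str      # text inside the parentheses after "[spam found]"
    stored_as: str   # path the message was written to


# Pattern inferred from the posted log excerpt; other ASSP versions or
# settings may log differently, so treat it as a starting point.
SPAM_FOUND = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<session>\S+) "
    r".*?\[spam found\] \((?P<reason>.*?)\)"
    r".*?-> (?P<path>[^;\s]+)"
)


def find_rejections(lines: Iterable[str]) -> Iterator[Rejection]:
    """Yield one Rejection per '[spam found]' line that names a stored file."""
    for line in lines:
        match = SPAM_FOUND.search(line)
        if match:
            yield Rejection(
                match["ts"], match["session"], match["reason"], match["path"]
            )
```

Feed it an open log file, for example
`for r in find_rejections(open("16-10-20.maillog.txt")): print(r)`, then
review each stored file and copy the ones that were wrongly rejected into
your corrected not-spam folder.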



>
> Esteemed Colleagues:
>
> I have a recurring problem with ASSP: it discards important incoming
> mail that contains the word "tangerine".  For example, if a client needs
> me to fly somewhere in an emergency, and I reply, "where can I get a
> tangerine airplane ticket?" or words to that effect, and the client
> replies to my mail, including my mail in his reply, the reply is apt
> to be discarded because it contains the word "tangerine", thus:


> 16-10-20.maillog.txt:2016-10-20 21:39:09 m1-17548-06726 [Worker_1]
> 209.85.220.170 <...@gmail.com> to: [hidden email] Regex:BombRe 'PB
> 20: for tangerine'
>
> 16-10-20.maillog.txt:2016-10-20 21:39:09 m1-17548-06726 [Worker_1]
> [bombRe] 209.85.220.170 <...@gmail.com> to: [hidden email]  (bombRe
> 'tangerine')
>
> 16-10-20.maillog.txt:2016-10-20 21:39:09 m1-17548-06726 [Worker_1]
> 209.85.220.170 <...@gmail.com> to: [hidden email] Message-Score:
> added 20 for Regex:BombRe 'PB 20: for tangerine'  bombRe: 'tangerine',
> total score for this message is now 21
>
> 16-10-20.maillog.txt:2016-10-20 21:39:09 m1-17548-06726 [Worker_1]
> [bombRe] 209.85.220.170 <...@gmail.com> to: [hidden email] [spam
> found] (Regex:BombRe 'PB 20: for tangerine'  bombRe: 'tangerine')
> [{redacted}] -> /opt/assp/discarded/6726--6440.eml;
>
> 16-10-20.maillog.txt:2016-10-20 21:39:09 m1-17548-06726 [Worker_1]
> 209.85.220.170 <...@gmail.com> to: [hidden email] [SMTP Error] 554
> 5.7.1 Delivery not authorized, message refused -- . (reason: Regex:BombRe
> 'PB 20: for tangerine'  bombRe: 'tangerine')


> Now, bombRe is a good idea, I suppose, but I should be able to control
> it.  How do I do that?  The word "tangerine" does not appear anywhere in
> assp.cfg.  In files/bombre.txt it appears only as "subject\: tangerineest"
> and that would not cause mail to be discarded that contains "tangerine"
> somewhere in its body.  tangerine (all capitals) also appears in
> files/tlds-alpha-by-domain.txt but that too, if I am not mistaken, would
> not cause mail to be discarded that contains "tangerine" somewhere in its
> body.  It also appears several places in files/optRE/blackListedDomains.txt
> -- e.g., "(?:quick)?usa|platform|tangerine|now|2u)\.biz" -- but that too,
> if I am not mistaken, would not cause mail to be discarded that
> contains "tangerine" somewhere in its body.


> It is possible, I suppose, that there is some utterly cryptic regular
> expression in some file that matches "tangerine" without actually
> containing the string "tangerine", but that would be utterly perverse and
> I refuse to believe that the universe is that malicious.  And yet, my
> e-mails are unquestionably being discarded.  How do I forever prevent
> that from happening?
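
On the "cryptic regular expression" question: that can be checked
mechanically instead of by reading the files. A rough Python sketch follows.
It assumes one pattern per line and '#' for comment lines, which I have not
verified against ASSP's actual file formats. Also keep in mind that ASSP uses
Perl regexes, so Python's re module will reject or misread some patterns;
lines that fail to compile are simply skipped here. The function name is
mine.

```python
import re
from typing import Iterable, Iterator, Tuple


def patterns_matching(word: str, lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line number, pattern) for each line that matches `word`.

    Assumes one regex per line and '#' comment lines (an assumption about
    the file layout). Matching is case-insensitive.
    """
    for lineno, raw in enumerate(lines, start=1):
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue
        try:
            compiled = re.compile(pattern, re.IGNORECASE)
        except re.error:
            # Not valid as a Python regex (it may still be valid in Perl).
            continue
        if compiled.search(word):
            yield lineno, pattern
```

Run it over files/bombre.txt and the other regex files with the word that is
being caught; any line it reports matches the word whether or not the word
appears in it literally. Note that a pattern able to match the empty string
will be reported for every word.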


> Thank you in advance for any and all replies.  One more thing -- if
> you do reply, please replace the word "tangerine" with the word
> "tangerine", otherwise your reply to me is apt to be discarded.  Thank
> you again.


>
>                         Jay F. Shachter
>                         6424 N Whipple St
>                         Chicago IL  60645-4111
>                                 (1-773)7613784   landline
>                                 (1-410)9964737   GoogleVoice
>                                 [hidden email]
>                                 http://m5.chicago.il.us
>
>                         "Quidquid latine dictum sit, altum videtur"

_______________________________________________
Assp-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/assp-user