HMM-Check has given less than 6 results - using monitoring mode only


Dossy Shiobara
Recently, it seems my HMM and Bayes checks are no longer working?  In
mail log, I see:

"HMM-Check has given less than 6 results - using monitoring mode only"

I'll include my latest rebuildrun.txt, which looks like it ran successfully.

Why is this happening?  I'm running ASSP 2.4.7(16004).  Also, it seems
like if I get this error, it doesn't even perform Bayesian scoring --
basically, spam that was previously being blocked is now being let
through...


---rebuildrun.txt---

Jan-28-16 09:05:00 RebuildSpamDB-thread rebuildspamdb-version 7.26
started in ASSP version 2.4.7(16004)

Jan-28-16 09:05:00 RebuildSpamDB uses BerkeleyDB for temporary hashes

Jan-28-16 09:05:00 RebuildSpamDB uses BerkeleyDB-ENV with 62.50 MByte

Jan-28-16 09:05:00 RebuildSpamDB will create a Hidden Markov Model

Jan-28-16 09:05:00 RebuildSpamDB will create unicode enabled databases

Jan-28-16 09:05:00 RebuildSpamDB will process all words as Sequence of
UAX #29 Grapheme Clusters

Jan-28-16 09:05:00 RebuildSpamDB will normalize unicode characters

Jan-28-16 09:05:00 RebuildSpamDB will use the ASSP_WordStem engine

Jan-28-16 09:05:00 ---ASSP Settings---
Jan-28-16 09:05:00 Do Not Collect Messages with RedListed address: Enabled
**Messages with RedListed addresses will be removed from the corpus!**

Jan-28-16 09:05:00 Do Not Collect RedRe Messages: Enabled
**Messages matching the RedRe will be removed from the corpus!**

Jan-28-16 09:05:00 Use Subject as Maillog Names: True
Jan-28-16 09:05:00 Maxbytes: 4,000
Jan-28-16 09:05:00 RebuildFileTimeLimit: 1 5
Jan-28-16 09:05:00 RebuildFileTimeLimit: files will be moved away from
the corpus if their processing takes longer than 5 second(s)

Jan-28-16 09:05:00 /data/assp/errors/spam
Jan-28-16 09:05:00 File Count:  11
Jan-28-16 09:05:00 Processing... errors/spam with 11 files
Jan-28-16 09:05:00 ignore and remove files older than Sep-11-88 10:05:00
in folder errors/spam
Jan-28-16 09:05:01 Imported Files for HeloBlackList:    10
Jan-28-16 09:05:01 Imported Files for Bayes/HMM:        10
Jan-28-16 09:05:01 Finished in 1 second(s)

Jan-28-16 09:05:01 /data/assp/errors/notspam
Jan-28-16 09:05:01 File Count:  1
Jan-28-16 09:05:01 Processing... errors/notspam with 1 files
Jan-28-16 09:05:01 ignore and remove files older than Sep-11-88 10:05:01
in folder errors/notspam
Jan-28-16 09:05:01 Imported Files for HeloBlackList:    0
Jan-28-16 09:05:01 Imported Files for Bayes/HMM:        0
Jan-28-16 09:05:01 Finished in 1 second(s)
Jan-28-16 09:05:01 info: corpusnorm after processing errors/spam and
errors/notspam is Spam Weight: 8280 / Not-Spam Weight: 0 => norm: 10.000
Jan-28-16 09:05:01 info: require approx. 6,726 files (3,255,584 words)
from folder spam to get the wanted corpusnorm (1.000)

Jan-28-16 09:05:01 /data/assp/spam
Jan-28-16 09:05:01 File Count:  11,195
Jan-28-16 09:05:01 Processing... spam with 11,195 files
Jan-28-16 09:05:01 ignore and remove files older than Dec-28-15 09:05:01
in folder spam
Jan-28-16 09:15:31 Removed Old: 5
Jan-28-16 09:15:31 Imported Files for HeloBlackList:    11,190
Jan-28-16 09:15:31 Imported Files for Bayes/HMM:        6,672
Jan-28-16 09:15:31 Finished in 630 second(s)
Jan-28-16 09:15:31 info: require approx. all files (3,264,527 words)
from folder notspam to get the wanted corpusnorm (1.000)

Jan-28-16 09:15:31 /data/assp/notspam
Jan-28-16 09:15:31 File Count:  7,009
Jan-28-16 09:15:31 Processing... notspam with 7,009 files
Jan-28-16 09:15:31 ignore and remove files older than Dec-28-15 09:15:31
in folder notspam
Jan-28-16 09:25:53 Removed Old: 7
Jan-28-16 09:25:53 Imported Files for HeloBlackList:    7,002
Jan-28-16 09:25:53 Imported Files for Bayes/HMM:        6,992
Jan-28-16 09:25:53 Finished in 622 second(s)

Jan-28-16 09:25:53 Generating weighted Bayesian tuplets
Jan-28-16 09:26:10 populating Spamdb 503166 records - Bayesian check is
now disabled
Jan-28-16 09:26:23 done - populating Spamdb records - 503166 - Bayesian
check is now enabled
Jan-28-16 09:26:23 done - Generating weighted Bayesian tuplets

Jan-28-16 09:26:23 Bayesian Pairs: 503,166 now in list

Jan-28-16 09:26:23 Generating consolidated Hidden-Markov-Model database
from 3,772,337 record model
Jan-28-16 09:28:22 HMM sequences: 1,848,357 now in list

Jan-28-16 09:28:22 generating Spamdb.helo records from 7,502 collected
HELO's
Jan-28-16 09:28:22 cleaning old Spamdb.helo records
Jan-28-16 09:28:22 done - cleaning old Spamdb.helo records

Jan-28-16 09:28:22 HELO Blacklist: 4 new, 0 now in list

Jan-28-16 09:28:22 Spam Weight    :   3,264,527
Jan-28-16 09:28:22 Not-Spam Weight:   3,265,258

Jan-28-16 09:28:22 Corpus norm: 0.9998 - (very good - balanced)
Jan-28-16 09:28:22 Corpus confidence:   0.06250000

Jan-28-16 09:28:27 Start populating Hidden Markov Model. HMM-check is
disabled for this time!
Jan-28-16 09:28:27 start populating Hidden Markov Model with 1,848,357
records!
Jan-28-16 09:28:59 Finished populating Hidden Markov Model with
1,848,357 records!
Jan-28-16 09:28:59 Finished populating Hidden Markov Model. HMM-check is
now enabled again!

Jan-28-16 09:28:59 Total processing time: 1,439 second(s)

Jan-28-16 09:28:59 Total processing data: 118.85 MByte


Jan-28-16 09:28:59 Rebuild processed 14.52 files per second.

Jan-28-16 09:28:59 After finishing the Rebuild process, the
/data/assp/tmpDB folder contains 791.45 MByte.

Jan-28-16 09:28:59 After finishing the Rebuild process, the drive that
contains the /data/assp/tmpDB folder has 1.22 GByte free space from
total 1.90 GByte.

Jan-28-16 09:28:59 building new GripList records and bounce report
Jan-28-16 09:28:59 processing Logfile /data/assp/logs/maillog.txt
Jan-28-16 09:28:59 processing Logfile /data/assp/logs/16-01-27.maillog.txt
Jan-28-16 09:29:01 processing Logfile /data/assp/logs/16-01-26.maillog.txt
Jan-28-16 09:29:02 processing Logfile /data/assp/logs/16-01-25.maillog.txt
Jan-28-16 09:29:03 processing Logfile /data/assp/logs/16-01-24.maillog.txt
Jan-28-16 09:29:03 processing Logfile /data/assp/logs/16-01-23.maillog.txt

Jan-28-16 09:29:03 skipping bounce report because 'DoNotCollectBounces'
is switched ON

Jan-28-16 09:29:03 Uploading Griplist via Direct Connection
Jan-28-16 09:29:04 Submitted 6,924 bytes: 0 IPv6 addresses, 768 IPv4
addresses

Jan-28-16 09:29:04 Trashlist was saved to /data/assp/trashlist.db
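
The corpus norm reported in this log appears to be the ratio of the two weights. A minimal Python sketch, assuming norm = Spam Weight / Not-Spam Weight and that the value is capped at 10 when the not-spam weight is 0 (both are assumptions read off this log, not taken from the ASSP source; `parse_weight` and `corpus_norm` are hypothetical helper names):

```python
def parse_weight(line: str) -> int:
    """Extract the number after the last colon, e.g. 'Spam Weight : 3,264,527'."""
    value = line.rsplit(":", 1)[1]
    return int(value.replace(",", "").strip())


def corpus_norm(spam_weight: int, notspam_weight: int, cap: float = 10.0) -> float:
    """Spam/not-spam weight ratio.

    Assumption: ASSP reports this ratio as the corpus norm and caps it
    (the log shows 10.000 when the not-spam weight is 0).
    """
    if notspam_weight == 0:
        return cap
    return min(spam_weight / notspam_weight, cap)


spam = parse_weight("Jan-28-16 09:28:22 Spam Weight    :   3,264,527")
ham = parse_weight("Jan-28-16 09:28:22 Not-Spam Weight:   3,265,258")
print(round(corpus_norm(spam, ham), 4))
```

With the weights from this run the ratio rounds to the 0.9998 shown in the log, which the log itself labels as balanced.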

--
Dossy Shiobara         |      "He realized the fastest way to change
[hidden email]     |   is to laugh at your own folly -- then you
http://panoptic.com/   |   can let go and quickly move on." (p. 70)
  * WordPress * jQuery * MySQL * Security * Business Continuity *


_______________________________________________
Assp-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/assp-user

Re: HMM-Check has given less than 6 results - using monitoring mode only

Dossy Shiobara
Also: I have BayesAfterHMM blank, and the log doesn't show any Bayes
scoring happening (I have DoHMM and DoBayesian both set to "score").



Re: HMM-Check has given less than 6 results - using monitoring mode only

Alexandre de Arruda Paes
Hi,

If you use a database (like MySQL), search the maillog to check whether
the records were transferred correctly after rebuilddb finished.
Here, if this goes wrong, the message is the same as yours.


Re: HMM-Check has given less than 6 results - using monitoring mode only

Dossy Shiobara
I am using BerkeleyDB.  What does the log message string look like if it
was transferred correctly so I can search for it?


On 1/28/16 5:30 PM, Alexandre de Arruda Paes wrote:
> If you use a database (like MySQL), search the maillog to check whether
> the records were transferred correctly after rebuilddb finished.
> Here, if this goes wrong, the message is the same as yours.


Re: HMM-Check has given less than 6 results - using monitoring mode only

Alexandre de Arruda Paes
I don't know if the result is the same with BerkeleyDB, but see my log below.


# grep Worker_10001 maillog.txt


jan-30-16 02:42:53 [Worker_10001] Try to lock HMM databases in 5 second(s)
jan-30-16 02:42:59 [Worker_10001] Start populating Hidden Markov Model.
HMM-check is disabled for this time!
jan-30-16 02:42:59 [Worker_10001] Start populating Hidden Markov Model with
1.046.257 records!
jan-30-16 02:42:59 [Worker_10001] Database import started for table hmmdb
jan-30-16 02:43:01 [Worker_10001] Trying Bulkimport for table hmmdb
jan-30-16 02:43:01 [Worker_10001] Database: MySQL 5.5.47-cll
jan-30-16 02:43:03 [Worker_10001] Added 1000 of 1046257 records for table
hmmdb - finished in 1045 sec
jan-30-16 02:43:03 [Worker_10001] Added 2000 of 1046257 records for table
hmmdb - finished in 522 sec
jan-30-16 02:43:03 [Worker_10001] Added 3000 of 1046257 records for table
hmmdb - finished in 347 sec
jan-30-16 02:43:03 [Worker_10001] Added 4000 of 1046257 records for table
hmmdb - finished in 260 sec
(...)
jan-30-16 02:44:40 [Worker_10001] Added 1036000 of 1046257 records for
table hmmdb - finished in 0 sec
jan-30-16 02:44:44 [Worker_10001] Bulkimport for table hmmdb finished
jan-30-16 02:44:44 [Worker_10001] Successfully added 1046257 records in to
table hmmdb
jan-30-16 02:44:44 [Worker_10001] Finished populating Hidden Markov Model
with 1.046.257 records!
jan-30-16 02:44:44 [Worker_10001] Finished populating Hidden Markov Model!
HMM-check is now enabled again!
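
To check another setup for the same start/finish pair, the maillog can be scanned for these two messages. A minimal Python sketch, assuming each log entry sits on one line in the real file (the wrapping seen here comes from the mail client) and that matching on the message text is sufficient; `hmm_population_finished` is a hypothetical helper name, not part of ASSP:

```python
def hmm_population_finished(log_lines):
    """Return True if the most recent HMM populate run has a finish line.

    Assumption: the 'Start populating' / 'Finished populating' messages
    quoted in this thread are the only markers that matter.
    """
    finished = None  # None means no populate run was seen at all
    for line in log_lines:
        text = line.lower()
        if "start populating hidden markov model" in text:
            finished = False
        elif "finished populating hidden markov model" in text:
            finished = True
    return finished is True


sample = [
    "jan-30-16 02:42:59 [Worker_10001] Start populating Hidden Markov Model. "
    "HMM-check is disabled for this time!",
    "jan-30-16 02:44:44 [Worker_10001] Finished populating Hidden Markov Model! "
    "HMM-check is now enabled again!",
]
print(hmm_population_finished(sample))
```

A finish line only shows that the populate step ran to the end; it does not prove that the HMM check has enough data to return results.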






Re: HMM-Check has given less than 6 results - using monitoring mode only

Dossy Shiobara
Okay, so... I'm going to include the entire snippet at the bottom of
this email, but I'm going to highlight sections here.

First:

Jan-30-16 12:05:01 [Worker_10001] File Count:   10,831
Jan-30-16 12:05:01 [Worker_10001] Processing... spam with 10,831 files
Jan-30-16 12:05:01 [Worker_10001] Ignore and remove files older than
Dec-30-15 12:05:01 in folder spam
Jan-30-16 12:15:13 [Worker_10001] Removed Old:  81

10 minutes to remove 81 old files?  I'm guessing it's stat()'ing each
and every file in some terribly inefficient way, because:

$ time find . -mtime +31 -ls | wc -l
0

real    0m0.048s
user    0m0.009s
sys     0m0.041s

$ time find . -mtime +30 -ls | wc -l
80

real    0m0.046s
user    0m0.005s
sys     0m0.043s

find(1) needs less than 0.05s to find all 80 files that are older than
30 days.  Can I turn off ASSP's expiration of old files and just cron a
find/rm script to do it, if ASSP is going to take 10 minutes?
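
The find/rm idea can be sketched in Python as well. This is a minimal sketch, assuming a flat corpus folder (it does not recurse the way find does) and the same age rule as `find -mtime +N`, where the age is counted in whole days and must exceed N; `files_older_than` is a hypothetical helper name:

```python
import os
import time

SECONDS_PER_DAY = 86400


def files_older_than(folder, days, now=None):
    """List regular files in `folder` whose age in whole days exceeds `days`.

    Mirrors `find -mtime +days`: fractional days are dropped before comparing.
    """
    now = time.time() if now is None else now
    old = []
    with os.scandir(folder) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            mtime = entry.stat(follow_symlinks=False).st_mtime
            age_days = int((now - mtime) // SECONDS_PER_DAY)
            if age_days > days:
                old.append(entry.path)
    return sorted(old)
```

The function only lists candidates, so a cron job would still have to call `os.remove()` on each path. It does not answer whether ASSP's own expiry can be switched off.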

Similarly, the scan of the notspam folder:

Jan-30-16 12:15:13 [Worker_10001] File Count:   6,917
Jan-30-16 12:15:13 [Worker_10001] Processing... notspam with 6,917 files
Jan-30-16 12:15:13 [Worker_10001] Ignore and remove files older than
Dec-30-15 12:15:13 in folder notspam
Jan-30-16 12:25:13 [Worker_10001] Removed Old:  34

10 minutes?  Is there some kind of sleep() in there that makes that
step take 10 minutes regardless of the time it takes to process the
files?  10 minutes for 10,831 files and 10 minutes for 6,917 files is
not a linear time-per-file duration, which seems really strange.
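
The per-file arithmetic can be checked against the Jan-28 rebuildrun.txt at the top of this thread, which lists both the folder file count and the count imported for Bayes/HMM. A minimal Python sketch; the numbers are copied from that log, and `seconds_per` is a hypothetical helper name:

```python
# Figures from the Jan-28 rebuildrun.txt posted earlier in the thread.
RUN = {
    "spam": {"seconds": 630, "files": 11195, "bayes_hmm": 6672},
    "notspam": {"seconds": 622, "files": 7009, "bayes_hmm": 6992},
}


def seconds_per(run, folder, key):
    """Elapsed seconds divided by the chosen file count for one folder."""
    return run[folder]["seconds"] / run[folder][key]


for folder in RUN:
    per_file = seconds_per(RUN, folder, "files")
    per_import = seconds_per(RUN, folder, "bayes_hmm")
    print(f"{folder}: {per_file:.4f} s per file, {per_import:.4f} s per Bayes/HMM import")
```

Per file in the folder the two durations differ noticeably, but per file imported for Bayes/HMM they are close to each other. One possible reading, not confirmed by anything in this thread, is that the elapsed time follows the number of files imported for Bayes/HMM rather than the expiry of old files.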

And, I see:

Jan-30-16 12:28:14 [Worker_10001] Finished populating Hidden Markov
Model! HMM-check is now enabled again!

Yet, I still get those "HMM-Check has given less than 6 results"
errors.  Is something else missing?


___ $ grep Worker_10001 logs/maillog.txt ___

Jan-30-16 12:05:00 [Worker_10001] Info: found module
/data/assp/lib/rebuildspamdb.pm version 7.26
Jan-30-16 12:05:00 [Worker_10001] RebuildSpamDB uses BerkeleyDB for
temporary hashes
Jan-30-16 12:05:00 [Worker_10001] RebuildSpamDB uses BerkeleyDB-ENV with
62.50 MByte
Jan-30-16 12:05:00 [Worker_10001] RebuildSpamDB-thread
rebuildspamdb-version 7.26 started in ASSP version 2.4.7(16004)
Jan-30-16 12:05:00 [Worker_10001] RebuildSpamDB will create a Hidden
Markov Model
Jan-30-16 12:05:00 [Worker_10001] RebuildSpamDB will create unicode
enabled databases
Jan-30-16 12:05:00 [Worker_10001] RebuildSpamDB will process all words
as Sequence of UAX #29 Grapheme Clusters
Jan-30-16 12:05:00 [Worker_10001] RebuildSpamDB will normalize unicode
characters
Jan-30-16 12:05:00 [Worker_10001] RebuildSpamDB will use the
ASSP_WordStem engine
Jan-30-16 12:05:00 [Worker_10001] Maxfiles: 14,000
Jan-30-16 12:05:00 [Worker_10001] RebuildFileTimeLimit: 1 5
Jan-30-16 12:05:00 [Worker_10001] RebuildFileTimeLimit: files will be
moved away from the corpus if their processing takes longer than 5 second(s)
Jan-30-16 12:05:00 [Worker_10001] /data/assp/errors/spam
Jan-30-16 12:05:00 [Worker_10001] File Count:   11
Jan-30-16 12:05:00 [Worker_10001] Processing... errors/spam with 11 files
Jan-30-16 12:05:00 [Worker_10001] Ignore and remove files older than
Sep-13-88 13:05:00 in folder errors/spam
Jan-30-16 12:05:00 [Worker_10001] Imported Files for HeloBlackList:     10
Jan-30-16 12:05:00 [Worker_10001] Imported Files for Bayes/HMM: 10
Jan-30-16 12:05:00 [Worker_10001] Finished in 1 second(s)
Jan-30-16 12:05:00 [Worker_10001] /data/assp/errors/notspam
Jan-30-16 12:05:00 [Worker_10001] File Count:   1
Jan-30-16 12:05:00 [Worker_10001] Processing... errors/notspam with 1 files
Jan-30-16 12:05:00 [Worker_10001] Ignore and remove files older than
Sep-13-88 13:05:00 in folder errors/notspam
Jan-30-16 12:05:00 [Worker_10001] Imported Files for HeloBlackList:     0
Jan-30-16 12:05:00 [Worker_10001] Imported Files for Bayes/HMM: 0
Jan-30-16 12:05:00 [Worker_10001] Finished in 1 second(s)
Jan-30-16 12:05:00 [Worker_10001] Info: corpusnorm after processing
errors/spam and errors/notspam is spamwords 8280/ hamwords 0 => 10.000
Jan-30-16 12:05:01 [Worker_10001] Info: require approx. 6,292 files
(3,152,789 words) from folder spam to get the wanted corpusnorm (1.000)
Jan-30-16 12:05:01 [Worker_10001] /data/assp/spam
Jan-30-16 12:05:01 [Worker_10001] File Count:   10,831
Jan-30-16 12:05:01 [Worker_10001] Processing... spam with 10,831 files
Jan-30-16 12:05:01 [Worker_10001] Ignore and remove files older than
Dec-30-15 12:05:01 in folder spam
Jan-30-16 12:15:13 [Worker_10001] Removed Old:  81
Jan-30-16 12:15:13 [Worker_10001] Imported Files for HeloBlackList:    
10,750
Jan-30-16 12:15:13 [Worker_10001] Imported Files for Bayes/HMM: 6,338
Jan-30-16 12:15:13 [Worker_10001] Finished in 612 second(s)
Jan-30-16 12:15:13 [Worker_10001] Info: require approx. all files
(3,161,976 words) from folder notspam to get the wanted corpusnorm (1.000)
Jan-30-16 12:15:13 [Worker_10001] /data/assp/notspam
Jan-30-16 12:15:13 [Worker_10001] File Count:   6,917
Jan-30-16 12:15:13 [Worker_10001] Processing... notspam with 6,917 files
Jan-30-16 12:15:13 [Worker_10001] Ignore and remove files older than
Dec-30-15 12:15:13 in folder notspam
Jan-30-16 12:25:13 [Worker_10001] Removed Old:  34
Jan-30-16 12:25:13 [Worker_10001] Imported Files for HeloBlackList:    
6,883
Jan-30-16 12:25:13 [Worker_10001] Imported Files for Bayes/HMM: 6,917
Jan-30-16 12:25:13 [Worker_10001] Finished in 600 second(s)
Jan-30-16 12:25:29 [Worker_10001] Populating 513541 Spamdb records -
Bayesian check is now disabled
Jan-30-16 12:25:29 [Worker_10001] Try to lock Spamdb database in 5 second(s)
Jan-30-16 12:25:42 [Worker_10001] Done - populating Spamdb records -
513541 - Bayesian check is now enabled
Jan-30-16 12:25:42 [Worker_10001] Bayesian Pairs: 513,541 now in list
Jan-30-16 12:25:42 [Worker_10001] Generating consolidated
Hidden-Markov-Model database from 3,740,686 record model
Jan-30-16 12:27:37 [Worker_10001] HMM sequences: 1,830,724 now in list
Jan-30-16 12:27:37 [Worker_10001] Generating Spamdb.helo records from
7,487 collected HELO's
Jan-30-16 12:27:37 [Worker_10001] Cleaning old Spamdb.helo records
Jan-30-16 12:27:37 [Worker_10001] Done - cleaning old Spamdb.helo records
Jan-30-16 12:27:37 [Worker_10001] HELO Blacklist: 1 new, 0 now in list
Jan-30-16 12:27:37 [Worker_10001] Try to lock HMM databases in 5 second(s)
Jan-30-16 12:27:42 [Worker_10001] Start populating Hidden Markov Model.
HMM-check is disabled for this time!
Jan-30-16 12:27:42 [Worker_10001] Start populating Hidden Markov Model
with 1,830,724 records!
Jan-30-16 12:28:14 [Worker_10001] Finished populating Hidden Markov
Model with 1,830,724 records!
Jan-30-16 12:28:14 [Worker_10001] Finished populating Hidden Markov
Model! HMM-check is now enabled again!
Jan-30-16 12:28:14 [Worker_10001] Total processing time: 1,394 second(s)
Jan-30-16 12:28:14 [Worker_10001] Total processed data: 116.19 MByte
Jan-30-16 12:28:14 [Worker_10001] Rebuild processed 14.53 files per second.
Jan-30-16 12:28:14 [Worker_10001] After finishing the Rebuild process,
the /data/assp/tmpDB folder contains 899.74 MByte.
Jan-30-16 12:28:14 [Worker_10001] After finishing the Rebuild process,
the drive that contains the /data/assp/tmpDB folder has 1.11 GByte free
space from total 1.90 GByte.
Jan-30-16 12:28:14 [Worker_10001] Building new GripList records and
bounce report
Jan-30-16 12:28:14 [Worker_10001] Processing Logfile
/data/assp/logs/maillog.txt
Jan-30-16 12:28:14 [Worker_10001] Processing Logfile
/data/assp/logs/16-01-29.maillog.txt
Jan-30-16 12:28:15 [Worker_10001] Processing Logfile
/data/assp/logs/16-01-28.maillog.txt
Jan-30-16 12:28:15 [Worker_10001] Processing Logfile
/data/assp/logs/16-01-27.maillog.txt
Jan-30-16 12:28:16 [Worker_10001] Processing Logfile
/data/assp/logs/16-01-26.maillog.txt
Jan-30-16 12:28:16 [Worker_10001] Processing Logfile
/data/assp/logs/16-01-25.maillog.txt
Jan-30-16 12:28:16 [Worker_10001] Downloading griplist.conf via direct
HTTP connection
Jan-30-16 12:28:17 [Worker_10001] Griplist.conf already up to date
Jan-30-16 12:28:17 [Worker_10001] Info: loaded GRIPLIST upload and
download URL's from /data/assp/griplist.conf
Jan-30-16 12:28:18 [Worker_10001] Submitted 5,583 bytes: 0 IPv6
addresses, 619 IPv4 addresses
Jan-30-16 12:28:18 [Worker_10001] Trashlist was saved to
/data/assp/trashlist.db




On 1/30/16 6:42 AM, Alexandre de Arruda Paes wrote:

> I don't know if the result is the same with BerkeleyDB, but see my log below.
>
>
> # grep Worker_10001 maillog.txt
>
>
> jan-30-16 02:42:53 [Worker_10001] Try to lock HMM databases in 5 second(s)
> jan-30-16 02:42:59 [Worker_10001] Start populating Hidden Markov Model.
> HMM-check is disabled for this time!
> jan-30-16 02:42:59 [Worker_10001] Start populating Hidden Markov Model with
> 1.046.257 records!
> jan-30-16 02:42:59 [Worker_10001] Database import started for table hmmdb
> jan-30-16 02:43:01 [Worker_10001] Trying Bulkimport for table hmmdb
> jan-30-16 02:43:01 [Worker_10001] Database: MySQL 5.5.47-cll
> jan-30-16 02:43:03 [Worker_10001] Added 1000 of 1046257 records for table
> hmmdb - finished in 1045 sec
> jan-30-16 02:43:03 [Worker_10001] Added 2000 of 1046257 records for table
> hmmdb - finished in 522 sec
> jan-30-16 02:43:03 [Worker_10001] Added 3000 of 1046257 records for table
> hmmdb - finished in 347 sec
> jan-30-16 02:43:03 [Worker_10001] Added 4000 of 1046257 records for table
> hmmdb - finished in 260 sec
> (...)
> jan-30-16 02:44:40 [Worker_10001] Added 1036000 of 1046257 records for
> table hmmdb - finished in 0 sec
> jan-30-16 02:44:44 [Worker_10001] Bulkimport for table hmmdb finished
> jan-30-16 02:44:44 [Worker_10001] Successfully added 1046257 records in to
> table hmmdb
> jan-30-16 02:44:44 [Worker_10001] Finished populating Hidden Markov Model
> with 1.046.257 records!
> jan-30-16 02:44:44 [Worker_10001] Finished populating Hidden Markov Model!
> HMM-check is now enabled again!
>
>
>
>
>
> 2016-01-28 22:44 GMT-02:00 Dossy Shiobara <[hidden email]>:
>
>> I am using BerkeleyDB.  What does the log message string look like if it
>> was transferred correctly so I can search for it?
>>
>>
>> On 1/28/16 5:30 PM, Alexandre de Arruda Paes wrote:
>>> If you use a database (like MySQL), search the maillog to check whether
>>> the records were transferred correctly after the rebuild finished.
>>> Here, when the transfer fails, I see the same message as yours.
>> --
>> Dossy Shiobara         |      "He realized the fastest way to change
>> [hidden email]     |   is to laugh at your own folly -- then you
>> http://panoptic.com/   |   can let go and quickly move on." (p. 70)
>>   * WordPress * jQuery * MySQL * Security * Business Continuity *
>>
--
Dossy Shiobara         |      "He realized the fastest way to change
[hidden email]     |   is to laugh at your own folly -- then you
http://panoptic.com/   |   can let go and quickly move on." (p. 70)
  * WordPress * jQuery * MySQL * Security * Business Continuity *


_______________________________________________
Assp-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/assp-user

Re: HMM-Check has given less than 6 results - using monitoring mode only

Thomas Eckardt/eck
HMM may return fewer than 6 results if the mail is too short, or if a
similar mail has never been seen.

Thomas





From:    Dossy Shiobara <[hidden email]>
To:      For Users of ASSP <[hidden email]>
Date:    30.01.2016 20:53
Subject: Re: [Assp-user] HMM-Check has given less than 6 results -
using monitoring mode only



Okay, so... I'm going to include the entire snippet at the bottom of
this email, but I'm going to highlight sections here.

First:

Jan-30-16 12:05:01 [Worker_10001] File Count:   10,831
Jan-30-16 12:05:01 [Worker_10001] Processing... spam with 10,831 files
Jan-30-16 12:05:01 [Worker_10001] Ignore and remove files older than
Dec-30-15 12:05:01 in folder spam
Jan-30-16 12:15:13 [Worker_10001] Removed Old:  81

10 minutes to remove 81 old files?  I'm guessing it's stat()'ing each
and every file in some terribly inefficient way, because:

$ time find . -mtime +31 -ls | wc -l
0

real    0m0.048s
user    0m0.009s
sys     0m0.041s

$ time find . -mtime +30 -ls | wc -l
80

real    0m0.046s
user    0m0.005s
sys     0m0.043s

find(1) needs less than 0.05s to find all 80 files that are older than
30 days.  Can I turn off ASSP's expiration of old files and just cron a
find/rm script to do it, if ASSP is going to take 10 minutes?
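If the expiry were moved to cron, a find(1) job along these lines might do it. This is a sketch only: the thread does not say whether ASSP's own expiry can be switched off, the function name `purge_old` is hypothetical, and the 30-day cutoff simply mirrors the rebuild log. It lists the candidates by default and deletes only when asked to.

```shell
# List (default) or delete (third argument "delete") regular files in a
# corpus folder whose mtime is more than N days old.
purge_old() {
  dir=$1 days=$2 mode=${3:-list}
  if [ "$mode" = delete ]; then
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
  else
    find "$dir" -type f -mtime +"$days" -print
  fi
}

# Example (dry run first, then the real thing):
#   purge_old /data/assp/spam 30
#   purge_old /data/assp/spam 30 delete
```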

Similarly, the scan of the notspam folder:

Jan-30-16 12:15:13 [Worker_10001] File Count:   6,917
Jan-30-16 12:15:13 [Worker_10001] Processing... notspam with 6,917 files
Jan-30-16 12:15:13 [Worker_10001] Ignore and remove files older than
Dec-30-15 12:15:13 in folder notspam
Jan-30-16 12:25:13 [Worker_10001] Removed Old:  34

10 minutes again?  Is there some kind of sleep() in there that makes
that step take 10 minutes regardless of how many files it processes?
10 minutes for 10,831 files and 10 minutes for 6,917 files is not a
linear time-per-file duration, which seems really strange.

And, I see:

Jan-30-16 12:28:14 [Worker_10001] Finished populating Hidden Markov
Model! HMM-check is now enabled again!

Yet, I still get those "HMM-Check has given less than 6 results"
errors.  Is something else missing?







Re: HMM-Check has given less than 6 results - using monitoring mode only

Dossy Shiobara
I'm seeing it in cases where neither of those two things is true:

Jan-31-16 12:01:01 m1-59661-00321 [Worker_1] [TLS-in] 208.118.235.17
<lilypond-user-bounces+dossy=[hidden email]> Message-Score: added
-10 (tlsValencePB) for SSL-TLS-connection-OK, total score for this
message is now -10
Jan-31-16 12:01:01 m1-59661-00321 [Worker_1] [TLS-in] [DKIM]
208.118.235.17 <lilypond-user-bounces+dossy=[hidden email]> to:
[hidden email] [scoring] DKIM domain mismatch - gnu.org found in
DKIMCache, but no DKIM-Signature found in mail header (Cache)
Jan-31-16 12:01:01 m1-59661-00321 [Worker_1] [TLS-in] 208.118.235.17
<lilypond-user-bounces+dossy=[hidden email]> to:
[hidden email] Message-Score: added 15 (dkimValencePB) for DKIM
domain mismatch - gnu.org found in DKIMCache, but no DKIM-Signature
found in mail header, total score for this message is now 5
Jan-31-16 12:01:02 m1-59661-00321 [Worker_1] [TLS-in] 208.118.235.17
<lilypond-user-bounces+dossy=[hidden email]> to:
[hidden email] info: remove IP-score from 208.118.235.17 - this mail
passed the SPF check
Jan-31-16 12:01:02 m1-59661-00321 [Worker_1] [TLS-in] 208.118.235.17
<lilypond-user-bounces+dossy=[hidden email]> to:
[hidden email] Message-Score: added -10 (spfpValencePB) for SPF
pass, total score for this message is now -5
Jan-31-16 12:01:03 m1-59661-00321 [Worker_1] [TLS-in] 208.118.235.17
<lilypond-user-bounces+dossy=[hidden email]> to:
[hidden email] HMM-Check has given less than 6 results - using
monitoring mode only
Jan-31-16 12:01:03 m1-59661-00321 [Worker_1] [TLS-in] [MessageOK]
208.118.235.17 <lilypond-user-bounces+dossy=[hidden email]> to:
[hidden email] message ok [lilypond user Digest Vol 158 Issue 188]
Jan-31-16 12:01:03 m1-59661-00321 [Worker_1] [TLS-in] 208.118.235.17
<lilypond-user-bounces+dossy=[hidden email]> to:
[hidden email] info: PB-IP-Score for '208.118.235.0' is 0, added 5
in this session

This is a message digest from a mailing list.  The email itself
including full headers contained 537 lines, 2,059 words for a total of
19,111 bytes.  What exactly is ASSP's definition of "too short"?

I receive emails from this mailing list regularly, so similar messages
have been seen daily for months now.  I just checked the notspam folder,
and it does not contain any messages from the list.  Is that what you
mean by "similar"?
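A quick way to double-check this is to count the corpus files that mention the list. A rough sketch, assuming the corpus files are plain message files under /data/assp/notspam (the path shown in the rebuild log); `count_matching` is a made-up helper name, and the search string is just the list name taken from the log excerpt above.

```shell
# Count files under a folder that contain the given string (case-insensitive).
count_matching() {
  grep -rliF -- "$2" "$1" | wc -l | tr -d ' '
}

# Example: count_matching /data/assp/notspam lilypond-user
```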

Regardless, I still have BayesAfterHMM left blank, so why isn't it doing
any Bayesian scoring?  (I have DoBayesian set to "score".)

I have AddSpamProbHeader and AddConfidenceHeader enabled.

Here are the ASSP headers of the email in question:

X-Assp-ID: ASSP.nospam m1-59661-00321
X-Assp-Session: 7F0C0F4B5D28 (mail 1)
X-Assp-Envelope-From: lilypond-user-bounces+dossy=[hidden email]
X-Assp-Intended-For: [hidden email]
X-Assp-Version: 2.4.7(16004) on ASSP.nospam
X-Assp-Client-TLS: yes
X-Assp-Message-Score: -10 (SSL-TLS-connection-OK)
X-Assp-IP-Score: -10 (SSL-TLS-connection-OK)
X-Assp-Delay: not delayed (auto accepted); 31 Jan 2016 12:01:01 -0500
X-Assp-Message-Score: 15 (DKIM domain mismatch - gnu.org found in
    DKIMCache, but no DKIM-Signature found in mail header)
X-Assp-IP-Score: 15 (DKIM domain mismatch - gnu.org found in DKIMCache, but
    no DKIM-Signature found in mail header)
X-Original-Authentication-Results: ASSP.nospam; spf=pass
X-Assp-Message-Score: -10 (SPF pass)
X-Assp-IP-Score: -10 (SPF pass)
X-Assp-Detected-URI: gnu.org(22), github.com(1), uminho.pt(8),
    mail.de(6)



On 1/31/16 8:57 AM, Thomas Eckardt wrote:
> HMM may return fewer than 6 results if the mail is too short, or if a
> similar mail has never been seen.

--
Dossy Shiobara         |      "He realized the fastest way to change
[hidden email]     |   is to laugh at your own folly -- then you
http://panoptic.com/   |   can let go and quickly move on." (p. 70)
  * WordPress * jQuery * MySQL * Security * Business Continuity *




Re: HMM-Check has given less than 6 results - using monitoring mode only

Thomas Eckardt/eck
In reply to this post by Dossy Shiobara
>Jan-28-16 09:28:22 Spam Weight    :   3,264,527
>Jan-28-16 09:28:22 Not-Spam Weight:   3,265,258

>Jan-28-16 09:28:22 Corpus norm: 0.9998 - (very good - balanced)
>Jan-28-16 09:28:22 Corpus confidence:   0.06250000

A corpus confidence of 0.06250000 is impossible when the corpus norm is
0.9998 (very good - balanced); the expected value is 1.000.
I think your BerkeleyDB ENV or DB is damaged for some or all BDB files,
and at least for HMMdb.
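The two values can be pulled out of rebuildrun.txt and compared automatically. The sketch below is an illustration, not ASSP's own logic: the helper name `check_corpus` and both thresholds (norm within 0.05 of 1, confidence below 0.9) are assumptions chosen only to flag a combination like the one above.

```shell
# Report corpus norm and confidence from a rebuildrun.txt and flag the
# "balanced norm but low confidence" combination.
check_corpus() {
  awk '
    /Corpus norm:/       { norm = $5 }
    /Corpus confidence:/ { conf = $5 }
    END {
      if (norm == "" || conf == "") { print "values not found"; exit }
      d = norm - 1; if (d < 0) d = -d
      # Thresholds are assumptions for illustration only.
      verdict = (d <= 0.05 && conf < 0.9) ? "suspicious" : "plausible"
      printf "norm=%s confidence=%s %s\n", norm, conf, verdict
    }' "$1"
}

# Example: check_corpus /data/assp/rebuildrun.txt
```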

- shut down assp
- remove all files (__*.* , *.bdb)  from assp/tmpDB/HMMdb
- do the same for spamdb
- remove assp/hmmdb.bdb and assp/spamdb.bdb
- start assp
- import any available backup for both DBs, or run a rebuildspamdb
- restart assp to force a recalculation of the used BDB cache
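The file removal in the steps above could be scripted roughly as follows. Treat it as a sketch under assumptions: the base directory /data/assp is taken from the logs in this thread, the tmpDB subfolder name for spamdb is a guess (the list only says "do the same for spamdb"), and `bdb_cleanup` is a hypothetical helper. It only prints what it would remove unless called with `apply`, and it should only be run while assp is fully stopped.

```shell
# Print (default) or remove (second argument "apply") the BerkeleyDB files
# named in the steps above. Run only while assp is shut down.
bdb_cleanup() {
  base=$1 mode=${2:-print}
  for f in \
    "$base"/tmpDB/HMMdb/__*.* "$base"/tmpDB/HMMdb/*.bdb \
    "$base"/tmpDB/spamdb/__*.* "$base"/tmpDB/spamdb/*.bdb \
    "$base"/hmmdb.bdb "$base"/spamdb.bdb
  do
    [ -f "$f" ] || continue          # skip patterns that matched nothing
    if [ "$mode" = apply ]; then rm -f -- "$f"; else echo "would remove $f"; fi
  done
}

# Example: bdb_cleanup /data/assp          (dry run)
#          bdb_cleanup /data/assp apply
```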

If you use any *nix, KEEP IN MIND: the stop case of your init.d script
for assp has to wait until assp has finished shutting down, otherwise
your BerkeleyDB files WILL BE DESTROYED!
To be clear, I mean 'WILL BE DESTROYED', not 'may be' or 'possibly'.
Some damage to BDB files is fixed by an assp-internal BDB repair
mechanism: some, not all!
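In an init.d script, the stop case can block until the process has really exited. A minimal sketch, assuming the script already knows the PID of the running assp process (how it obtains it, for example from a PID file, is left out); the function name `stop_and_wait` and the default timeout are illustrative and not part of ASSP.

```shell
# Send SIGTERM to a PID and wait until the process is gone.
# Returns 0 once it has exited, 1 if it is still running after the timeout.
stop_and_wait() {
  pid=$1 timeout=${2:-300}
  kill -0 "$pid" 2>/dev/null || return 0   # already gone
  kill -TERM "$pid"
  waited=0
  while kill -0 "$pid" 2>/dev/null; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

Called from the stop case, for example as `stop_and_wait "$pid" 300 || echo "assp still running" >&2`, the script does not return until the process has exited or the timeout has passed.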

Thomas



From:    Dossy Shiobara <[hidden email]>
To:      [hidden email]
Date:    28.01.2016 17:17
Subject: [Assp-user] HMM-Check has given less than 6 results -
using monitoring mode only



Recently, it seems my HMM and Bayes checks are no longer working?  In
mail log, I see:

"HMM-Check has given less than 6 results - using monitoring mode only"

I'll include my latest rebuildrun.txt, which looks like it ran
successfully.

Why is this happening?  I'm running ASSP 2.4.7(16004).  Also, it seems
like if I get this error, it doesn't even perform Bayesian scoring --
basically, spam that was previously being blocked is now being let
through...


---rebuildrun.txt---

Jan-28-16 09:05:00 RebuildSpamDB-thread rebuildspamdb-version 7.26
started in ASSP version 2.4.7(16004)

Jan-28-16 09:05:00 RebuildSpamDB uses BerkeleyDB for temporary hashes

Jan-28-16 09:05:00 RebuildSpamDB uses BerkeleyDB-ENV with 62.50 MByte

Jan-28-16 09:05:00 RebuildSpamDB will create a Hidden Markov Model

Jan-28-16 09:05:00 RebuildSpamDB will create unicode enabled databases

Jan-28-16 09:05:00 RebuildSpamDB will process all words as Sequence of
UAX #29 Grapheme Clusters

Jan-28-16 09:05:00 RebuildSpamDB will normalize unicode characters

Jan-28-16 09:05:00 RebuildSpamDB will use the ASSP_WordStem engine

Jan-28-16 09:05:00 ---ASSP Settings---
Jan-28-16 09:05:00 Do Not Collect Messages with RedListed address: Enabled
**Messages with RedListed addresses will be removed from the corpus!**

Jan-28-16 09:05:00 Do Not Collect RedRe Messages: Enabled
**Messages matching the RedRe will be removed from the corpus!**

Jan-28-16 09:05:00 Use Subject as Maillog Names: True
Jan-28-16 09:05:00 Maxbytes: 4,000
Jan-28-16 09:05:00 RebuildFileTimeLimit: 1 5
Jan-28-16 09:05:00 RebuildFileTimeLimit: files will be moved away from
the corpus if their processing takes longer than 5 second(s)

Jan-28-16 09:05:00 /data/assp/errors/spam
Jan-28-16 09:05:00 File Count:  11
Jan-28-16 09:05:00 Processing... errors/spam with 11 files
Jan-28-16 09:05:00 ignore and remove files older than Sep-11-88 10:05:00
in folder errors/spam
Jan-28-16 09:05:01 Imported Files for HeloBlackList:    10
Jan-28-16 09:05:01 Imported Files for Bayes/HMM:        10
Jan-28-16 09:05:01 Finished in 1 second(s)

Jan-28-16 09:05:01 /data/assp/errors/notspam
Jan-28-16 09:05:01 File Count:  1
Jan-28-16 09:05:01 Processing... errors/notspam with 1 files
Jan-28-16 09:05:01 ignore and remove files older than Sep-11-88 10:05:01
in folder errors/notspam
Jan-28-16 09:05:01 Imported Files for HeloBlackList:    0
Jan-28-16 09:05:01 Imported Files for Bayes/HMM:        0
Jan-28-16 09:05:01 Finished in 1 second(s)
Jan-28-16 09:05:01 info: corpusnorm after processing errors/spam and
errors/notspam is Spam Weight: 8280 / Not-Spam Weight: 0 => norm: 10.000
Jan-28-16 09:05:01 info: require approx. 6,726 files (3,255,584 words)
from folder spam to get the wanted corpusnorm (1.000)

Jan-28-16 09:05:01 /data/assp/spam
Jan-28-16 09:05:01 File Count:  11,195
Jan-28-16 09:05:01 Processing... spam with 11,195 files
Jan-28-16 09:05:01 ignore and remove files older than Dec-28-15 09:05:01
in folder spam
Jan-28-16 09:15:31 Removed Old: 5
Jan-28-16 09:15:31 Imported Files for HeloBlackList:    11,190
Jan-28-16 09:15:31 Imported Files for Bayes/HMM:        6,672
Jan-28-16 09:15:31 Finished in 630 second(s)
Jan-28-16 09:15:31 info: require approx. all files (3,264,527 words)
from folder notspam to get the wanted corpusnorm (1.000)

Jan-28-16 09:15:31 /data/assp/notspam
Jan-28-16 09:15:31 File Count:  7,009
Jan-28-16 09:15:31 Processing... notspam with 7,009 files
Jan-28-16 09:15:31 ignore and remove files older than Dec-28-15 09:15:31
in folder notspam
Jan-28-16 09:25:53 Removed Old: 7
Jan-28-16 09:25:53 Imported Files for HeloBlackList:    7,002
Jan-28-16 09:25:53 Imported Files for Bayes/HMM:        6,992
Jan-28-16 09:25:53 Finished in 622 second(s)

Jan-28-16 09:25:53 Generating weighted Bayesian tuplets
Jan-28-16 09:26:10 populating Spamdb 503166 records - Bayesian check is
now disabled
Jan-28-16 09:26:23 done - populating Spamdb records - 503166 - Bayesian
check is now enabled
Jan-28-16 09:26:23 done - Generating weighted Bayesian tuplets

Jan-28-16 09:26:23 Bayesian Pairs: 503,166 now in list

Jan-28-16 09:26:23 Generating consolidated Hidden-Markov-Model database
from 3,772,337 record model
Jan-28-16 09:28:22 HMM sequences: 1,848,357 now in list

Jan-28-16 09:28:22 generating Spamdb.helo records from 7,502 collected
HELO's
Jan-28-16 09:28:22 cleaning old Spamdb.helo records
Jan-28-16 09:28:22 done - cleaning old Spamdb.helo records

Jan-28-16 09:28:22 HELO Blacklist: 4 new, 0 now in list

Jan-28-16 09:28:22 Spam Weight    :   3,264,527
Jan-28-16 09:28:22 Not-Spam Weight:   3,265,258

Jan-28-16 09:28:22 Corpus norm: 0.9998 - (very good - balanced)
Jan-28-16 09:28:22 Corpus confidence:   0.06250000

Jan-28-16 09:28:27 Start populating Hidden Markov Model. HMM-check is
disabled for this time!
Jan-28-16 09:28:27 start populating Hidden Markov Model with 1,848,357
records!
Jan-28-16 09:28:59 Finished populating Hidden Markov Model with
1,848,357 records!
Jan-28-16 09:28:59 Finished populating Hidden Markov Model. HMM-check is
now enabled again!

Jan-28-16 09:28:59 Total processing time: 1,439 second(s)

Jan-28-16 09:28:59 Total processing data: 118.85 MByte


Jan-28-16 09:28:59 Rebuild processed 14.52 files per second.

Jan-28-16 09:28:59 After finishing the Rebuild process, the
/data/assp/tmpDB folder contains 791.45 MByte.

Jan-28-16 09:28:59 After finishing the Rebuild process, the drive that
contains the /data/assp/tmpDB folder has 1.22 GByte free space from
total 1.90 GByte.

Jan-28-16 09:28:59 building new GripList records and bounce report
Jan-28-16 09:28:59 processing Logfile /data/assp/logs/maillog.txt
Jan-28-16 09:28:59 processing Logfile /data/assp/logs/16-01-27.maillog.txt
Jan-28-16 09:29:01 processing Logfile /data/assp/logs/16-01-26.maillog.txt
Jan-28-16 09:29:02 processing Logfile /data/assp/logs/16-01-25.maillog.txt
Jan-28-16 09:29:03 processing Logfile /data/assp/logs/16-01-24.maillog.txt
Jan-28-16 09:29:03 processing Logfile /data/assp/logs/16-01-23.maillog.txt

Jan-28-16 09:29:03 skipping bounce report because 'DoNotCollectBounces'
is switched ON

Jan-28-16 09:29:03 Uploading Griplist via Direct Connection
Jan-28-16 09:29:04 Submitted 6,924 bytes: 0 IPv6 addresses, 768 IPv4
addresses

Jan-28-16 09:29:04 Trashlist was saved to /data/assp/trashlist.db

--
Dossy Shiobara         |      "He realized the fastest way to change
[hidden email]     |   is to laugh at your own folly -- then you
http://panoptic.com/   |   can let go and quickly move on." (p. 70)
  * WordPress * jQuery * MySQL * Security * Business Continuity *


_______________________________________________
Assp-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/assp-user

Re: HMM-Check has given less than 6 results - using monitoring mode only

Dossy Shiobara
Okay, thank you for the instructions!  I've followed them, and here's
the result of the rebuildspamdb:

Feb-02-16 14:16:41 Spam Weight    :   3,226,691
Feb-02-16 14:16:41 Not-Spam Weight:   3,227,259

Feb-02-16 14:16:41 Corpus norm: 0.9998 - (very good - balanced)
Feb-02-16 14:16:41 Corpus confidence:   1.00000000
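For reference, the corpus norm printed in both rebuild runs is consistent with a plain ratio of the two weights. The sketch below just redoes that arithmetic. The function name `corpus_norm` is made up for illustration, and reading the norm as spam weight divided by not-spam weight is inferred from the logged numbers, not taken from the ASSP source.

```python
def corpus_norm(spam_weight: int, notspam_weight: int) -> float:
    """Ratio of spam weight to not-spam weight (assumed definition)."""
    if notspam_weight == 0:
        raise ValueError("not-spam weight is zero; the ratio is undefined")
    return spam_weight / notspam_weight


# Weights copied from the two rebuild logs in this thread.
runs = {
    "Jan-28-16": (3_264_527, 3_265_258),
    "Feb-02-16": (3_226_691, 3_227_259),
}

for day, (spam, notspam) in runs.items():
    print(f"{day}: norm = {corpus_norm(spam, notspam):.4f}")
```

Both runs come out at 0.9998, matching the log. The corpus confidence is a separate figure and is not derived here.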

I'm really not thrilled with BerkeleyDB (I've run into very specific
problems using it with a threaded Tcl).  Any chance I could use SQLite3
instead?  I'm not keen on setting up a full-blown MySQL instance just
for this, if I can avoid it.

Any ideas as to why the cleanup of old files takes so long (a fixed 10
minutes), or is there a way I can disable it and just clean out the files
myself with a cron job instead?
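If the pruning were moved to cron, the external job could look roughly like the sketch below. It is only an illustration under stated assumptions: the 31-day cutoff mirrors the "older than Dec-28-15" lines in the Jan-28 rebuild log, file age is judged by modification time (ASSP may use a different criterion), the helper name `prune_old_files` is hypothetical, and the folder paths are the ones shown in that log. Whether ASSP's own cleanup can be switched off is not answered by this sketch.

```python
import time
from pathlib import Path


def prune_old_files(folder: Path, max_age_days: float, dry_run: bool = True) -> list[Path]:
    """Return regular files in `folder` older than `max_age_days`.

    With dry_run=False the files are also deleted. Age is based on
    mtime, which is an assumption about how "old" should be defined.
    """
    cutoff = time.time() - max_age_days * 86400
    stale = [
        p for p in sorted(folder.iterdir())
        if p.is_file() and p.stat().st_mtime < cutoff
    ]
    if not dry_run:
        for p in stale:
            p.unlink()
    return stale


if __name__ == "__main__":
    # Paths as printed in the rebuild log; adjust for your install.
    for name in ("spam", "notspam"):
        folder = Path("/data/assp") / name
        if folder.is_dir():
            for p in prune_old_files(folder, max_age_days=31):
                print("would remove", p)
```

The dry-run default means a first cron run only reports what it would delete; pass `dry_run=False` once the list looks right.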


On 2/1/16 11:48 PM, Thomas Eckardt wrote:

>> Jan-28-16 09:28:22 Spam Weight    :   3,264,527
>> Jan-28-16 09:28:22 Not-Spam Weight:   3,265,258
>> Jan-28-16 09:28:22 Corpus norm: 0.9998 - (very good - balanced)
>> Jan-28-16 09:28:22 Corpus confidence:   0.06250000
> Corpus confidence:   0.06250000 - this value is impossible (1.000 is
> expected) when the corpus norm is 0.9998 (very good - balanced).
> I think your BerkeleyDB ENV or DB is damaged for some or all BDB files,
> but at least for HMMdb.
>
> - shut down assp
> - remove all files (__*.* , *.bdb) from assp/tmpDB/HMMdb
> - do the same for spamdb
> - remove assp/hmmdb.bdb and assp/spamdb.bdb
> - start assp
> - import any available backup for both DBs, or run a rebuildspamdb
> - restart assp to force a recalculation of the used BDB cache
>
> If you use any *nix, KEEP IN MIND: the stop case of your init.d script
> for assp has to wait until assp has finished; otherwise your BerkeleyDB
> files WILL BE DESTROYED!
> To be clear, I mean 'WILL BE DESTROYED', not 'may be' or 'possibly'. Some
> damage to BDB files is fixed by an assp-internal BDB repair mechanism:
> some, not all!
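
The file-removal steps quoted above could be scripted along these lines. This is a sketch, not an ASSP tool: the helper names are hypothetical, the ASSP base directory is supplied by the caller, and reading "do the same for spamdb" as the `tmpDB/spamdb` folder is an assumption. It only lists files by default, and it should be run only after ASSP has fully stopped.

```python
from pathlib import Path

PATTERNS = ("__*.*", "*.bdb")            # globs from the quoted steps
TMP_SUBDIRS = ("HMMdb", "spamdb")        # the "spamdb" subfolder is an assumption
TOP_LEVEL = ("hmmdb.bdb", "spamdb.bdb")  # files directly under the base dir


def bdb_cleanup_targets(assp_base: Path) -> list[Path]:
    """Collect the BerkeleyDB files the quoted steps say to remove."""
    targets = set()
    for sub in TMP_SUBDIRS:
        folder = assp_base / "tmpDB" / sub
        if folder.is_dir():
            for pattern in PATTERNS:
                targets.update(p for p in folder.glob(pattern) if p.is_file())
    for name in TOP_LEVEL:
        candidate = assp_base / name
        if candidate.is_file():
            targets.add(candidate)
    return sorted(targets)


def remove_targets(targets: list[Path], dry_run: bool = True) -> None:
    """Print each target; delete it only when dry_run is False."""
    for path in targets:
        print(("would remove " if dry_run else "removing ") + str(path))
        if not dry_run:
            path.unlink()
```

A cautious run would be `remove_targets(bdb_cleanup_targets(Path("/data/assp")))` first, then the same call with `dry_run=False` once the listing looks right; `/data/assp` is the base path seen in the rebuild log and may differ on other installs.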


Re: HMM-Check has given less than 6 results - using monitoring mode only

Thomas Eckardt/eck
>I'm really not thrilled with BerkeleyDB
Why do you use a DB engine that you don't know?

>SQLite3

You may use any ANSI-SQL (RDB) database engine you know (and that Perl has
a driver for).

>I'm not keen on setting up a full-blown MySQL instance just
>for this, if I can avoid it.

But keep in mind! ASSP is not a lightweight database user if HMM and
Bayesian are used. It is possible that 7 workers (each doing 600 SQL
queries per second) produce 4,200 (or more) SQL queries per second.
We have seen conditions where 5 ASSP V2 instances brought a well-designed
(16 cores, 200 GB RAM, SSD), very large enterprise MySQL DB server
to its physical limit!
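
The figure above is plain multiplication, and a short sketch makes the scaling explicit. The 7 workers and 600 queries per second come from the paragraph above; applying the same rate to the 5 instances mentioned there is an extrapolation for illustration only, since the thread does not say those instances ran at that rate. The function name is hypothetical.

```python
def queries_per_second(workers: int, per_worker_qps: int, instances: int = 1) -> int:
    """Aggregate SQL query rate if every worker sustains per_worker_qps."""
    return workers * per_worker_qps * instances


print(queries_per_second(7, 600))     # one ASSP instance: 4200
print(queries_per_second(7, 600, 5))  # hypothetical: five such instances, 21000
```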

It is wrong to think of assp as a "simple" spam filter. If you want to
build an enterprise anti-spam solution with assp, you WILL NEED:

enterprise hardware
enterprise software
enterprise IT knowledge

You may change the word "enterprise" to 'standard', 'simple', 'advanced',
... - the relation will be the same. (I'm not sure that 'simple'
IT knowledge will be enough in any case.)

Thomas

Re: HMM-Check has given less than 6 results - using monitoring mode only

Alexandre de Arruda Paes
In reply to this post by Dossy Shiobara
Now, I have a similar problem. My database is OK.

fev-03-16 11:01:26 id-04484-02550 [Worker_5] 186.195.241.96 <
[hidden email]> to: jose@f DKIM-Signature found
fev-03-16 11:01:26 id-04484-02550 [Worker_5] 186.195.241.96 <
[hidden email]> to: jose@f Message-Score: added 3 for
186.195.241.0 in griplist (0.83), total score for this message is now 3
fev-03-16 11:01:26 id-04484-02550 [Worker_5] 186.195.241.96 <
[hidden email]> to: jose@f [scoring] DKIM signature
verified-OK - header-passed - sender policy is: neutral - author policy is:
neutral
fev-03-16 11:01:26 id-04484-02550 [Worker_5] 186.195.241.96 <
[hidden email]> to: jose@f info: domain criatus.com.br has
published a DMARC record
fev-03-16 11:01:26 id-04484-02550 [Worker_5] 186.195.241.96 <
[hidden email]> to: jose@f [scoring] SPF: pass
ip=186.195.241.96 mailfrom=[hidden email]
helo=mm241-96.criatus.com.br
fev-03-16 11:01:26 id-04484-02550 [Worker_5] 186.195.241.96 <
[hidden email]> to: jose@f Message-Score: added -1
(spfpValencePB) for SPF pass, total score for this message is now 2
fev-03-16 11:01:28 id-04484-02550 [Worker_5] 186.195.241.96 <
[hidden email]> to: jose@f [scoring] (URIBL: neutral,
criatus.com.br listed in multi.surbl.org
fev-03-16 11:01:28 id-04484-02550 [Worker_5] 186.195.241.96 <
[hidden email]> to: jose@f Message-Score: added 22 for
URIBL: neutral, criatus.com.br listed in multi.surbl.org, total score for
this message is now 24
fev-03-16 11:01:28 id-04484-02550 [Worker_5] 186.195.241.96 <
[hidden email]> to: jose@f HMM-Check has given less than 6
results - using monitoring mode only
fev-03-16 11:01:28 id-04484-02550 [Worker_5] [MessageOK] 186.195.241.96 <
[hidden email]> to: jose@f message ok [100 de 4G Maz LG
Leon dual Tarifa Zero de Voz e SMS] -> /mnt/extras/assp/okmail/--349107.eml

But if I copy and paste the contents of --349107.eml into the mail analyzer
in the GUI:

*Bayesian Spam Probability:*

*combined probability*: 1.00000000 - got 186 - used 60 most significant
results
------------------------------

*Hidden-Markov-Model Spam Probability:*

*combined HMM spam probability*: 1.0000 - got 191 - used 60 most
significant results


And the Bayesian check is not run after the HMM check.

ASSP 2.4.4(15004)


Best regards!