Unable to run versions newer than 16018

Scott MacLean-4
I'm still unable to run versions of ASSP newer than build 16018. I'm
seeing the same exact behavior on any version newer than 16018, but
reverting to 16018 fixes the problem.

Something is killing the main thread, or causing long delays, which just
makes everything start to fail.

ASSP will run for about five minutes after it is started, with no
problems. Then it will start getting slow. I'll see things like this in
the log:

Feb-10-16 10:28:29 [Worker_10000] Warning: Worker_10000 - check the
'ADO' database connections has taken 32.774 seconds (max=1.000s)
Feb-10-16 10:28:31 [Worker_10000] Info: Name Server x.x.x.x:
ResponseTime = 1232 ms for sourceforge.net
Feb-10-16 10:28:31 [Worker_10000] Info: Name Server y.y.y.y:
ResponseTime = 282 ms for sourceforge.net
Feb-10-16 10:28:31 [Worker_10000] Info: Name Server z.z.z.z:
ResponseTime = 11 ms for sourceforge.net
Feb-10-16 10:28:31 [Worker_10000] Info: switched (DNS) nameserver order
from x.x.x.x , y.y.y.y , z.z.z.z to z.z.z.z , x.x.x.x , y.y.y.y

The nameserver on IP x.x.x.x is on the same machine as ASSP, and
absolutely nothing is going on with the machine. All eight cores of the
CPU are essentially sitting idle, there are many GB of memory free, the
disk queue is sitting at near zero - the machine is essentially idle, so
there is no reason why a DNS query would take 1232 ms. I can only assume
then that the DNS query test that is being run by ASSP is taking too
long because of a problem within ASSP, and the response time being shown
is invalid.

Similarly, the SQL server, also running on the local machine, is 100%
healthy and is sitting idle, waiting for a query from ASSP. There is no
way it should be sitting there for 32 seconds waiting for a connection.
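
One way to test the suspicion that the delay is inside ASSP rather than in the name server is to time the same lookup from a separate process and compare the figures. The following is a minimal sketch, not part of ASSP. It assumes Python is available on the machine, and the helper names `timed_ms` and `resolve` are made up for illustration. Note that it goes through the operating system's resolver rather than querying x.x.x.x directly, so it is only a rough cross-check.

```python
import socket
import time


def timed_ms(fn, *args):
    """Run fn(*args) and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - start) * 1000.0


def resolve(host):
    """Resolve host through the operating system's resolver."""
    return socket.getaddrinfo(host, None)


# Example call (needs network access, so it is left commented out):
# _, ms = timed_ms(resolve, "sourceforge.net")
# print(f"lookup took {ms:.0f} ms")
```

If an external lookup returns in a few milliseconds while ASSP logs more than a second for the same server, that points at the process doing the measuring rather than at DNS.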

After this ASSP starts sending some queued email:

Feb-10-16 10:28:32 [Worker_10000] Info: looking for files to (re)send
Feb-10-16 10:28:42 [Worker_10000] (re)send - try to open:
D:/ASSP/resendmail/n000000153.eml
Feb-10-16 10:28:48 [Worker_10000] (re)send - process:
D:/ASSP/resendmail/n000000153.eml (first time)
Feb-10-16 10:28:49 [Worker_10000] (re)send -
D:/ASSP/resendmail/n000000153.eml - From: [hidden email] - To:
[hidden email]
Feb-10-16 10:28:49 [Worker_10000] (re)send
D:/ASSP/resendmail/n000000153.eml to host: 127.0.0.1:6027 (smtpDestination)
Feb-10-16 10:28:50 NB-18099-00275 [Worker_2] [MessageOK] y.y.y.y
<[hidden email]> to: [hidden email] message ok - (whiteListedIPs
'y.y.y.y/32 New Web server') - [New password activation] ->
D:/ASSP/notspam/275.eml
Feb-10-16 10:28:58 [Worker_10000] Info: Net::SMTP is used to send mail
Feb-10-16 10:29:00 [Worker_5] Worker_5 wakes up
Feb-10-16 10:29:01 [Main_Thread] IP 127.0.0.1 matches
allowStatConnectionsFrom - with 127.0.0.1/32
Feb-10-16 10:29:03 [Worker_10000] Info: successful sent file
D:/ASSP/resendmail/n000000153.eml to 127.0.0.1:6027 (smtpDestination)
Feb-10-16 10:29:04 [Worker_10000] (re)send - try to open:
D:/ASSP/resendmail/n000000224.eml
Feb-10-16 10:29:04 [Worker_10000] (re)send - process:
D:/ASSP/resendmail/n000000224.eml (first time)
Feb-10-16 10:29:04 [Worker_10000] (re)send -
D:/ASSP/resendmail/n000000224.eml - From: [hidden email] - To:
[hidden email]
Feb-10-16 10:29:04 [Worker_10000] (re)send
D:/ASSP/resendmail/n000000224.eml to host: 127.0.0.1:6027 (smtpDestination)
Feb-10-16 10:29:04 [Worker_10000] Info: Net::SMTP is used to send mail
Feb-10-16 10:29:05 [Main_Thread] Info: no (more) data readable
(connection possibly closed by browser)
Feb-10-16 10:29:06 [Worker_10000] Info: successful sent file
D:/ASSP/resendmail/n000000224.eml to 127.0.0.1:3027 (smtpDestination)
Feb-10-16 10:29:07 [Main_Thread] Info: no (more) data readable
(connection possibly closed by browser)
Feb-10-16 10:29:32 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:29:32 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:29:32 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:29:32 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:29:32 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)

And then it dies. It will write literally thousands and thousands of
these "unable to detect any running worker" lines into the log - several
hundred of them per second. This goes on for a couple of minutes until
it just gives up and restarts:

Feb-10-16 10:31:36 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:31:36 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:31:36 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:31:37 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:31:37 [Main_Thread] Info: ConnectionTransferTimeOut (30
seconds) is now reached
Feb-10-16 10:31:37 [Main_Thread] Warning: Main_Thread is unable to
transfer connection to any worker - try again!
Feb-10-16 10:31:37 [Main_Thread] Info: notification message queued to
sent to [hidden email]
Feb-10-16 10:31:37 [Main_Thread] Error: Main_Thread is unable to
transfer connection to any worker within 120 seconds - restart ASSP!
Feb-10-16 10:31:38 [Main_Thread] Initializing shutdown sequence
Feb-10-16 10:31:40 [Shutdown] Info: removing all SMTP and Proxy listeners
Feb-10-16 10:31:41 [Shutdown] Tell Worker 3 - QUIT
Feb-10-16 10:31:41 [Shutdown] Tell Worker 4 - QUIT
Feb-10-16 10:31:41 [Shutdown] Tell Worker 5 - QUIT
Feb-10-16 10:31:41 [Shutdown] Tell Worker 1 - QUIT
Feb-10-16 10:31:41 [Shutdown] Tell Worker 2 - QUIT
Feb-10-16 10:31:41 [Shutdown] Waiting for all SMTP-Workers to be finished
Feb-10-16 10:31:45 [Shutdown] Warning: poll cycle (2) has taken
3.1647617816925 seconds - this is very much is too long
Feb-10-16 10:32:08 NB-18036-09167 [Worker_3] 22.22.22.22
<[hidden email]> to: [hidden email] info: received all data - all
data moved to send queue (8)
Feb-10-16 10:32:08 [Worker_3] Worker_3 has active connections. Will wait
until all connections are finished but max 45 seconds!
Feb-10-16 10:32:31 [Shutdown] Error: at least one of the SMTP workers
has not finished work within 50 seconds
Feb-10-16 10:32:33 [Shutdown] Closing all databases
Feb-10-16 10:32:45 [Shutdown] Info: removing all WEB listeners
Feb-10-16 10:32:45 [Shutdown] Info: shutdown reason was: restarting
Feb-10-16 10:32:45 [Shutdown] ASSP finished work
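
To pin down when the stall begins and how fast the messages pile up, the log can be summarised per second. This is a minimal sketch under assumptions: it expects the timestamp layout shown in the excerpts above (for example "Feb-10-16 10:29:32"), and the function names are hypothetical, not anything ASSP provides.

```python
import re
from collections import Counter

# Matches the start of a Main_Thread stall entry and captures its timestamp.
# The timestamp layout is assumed from the log excerpts in this post.
STALL = re.compile(
    r"^(\w{3}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[Main_Thread\] "
    r"Info: unable to detect any running"
)


def stall_counts(lines):
    """Count 'unable to detect any running worker' entries per second."""
    counts = Counter()
    for line in lines:
        match = STALL.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def first_stall(lines):
    """Return the timestamp of the first stall entry, or None if there is none."""
    for line in lines:
        match = STALL.match(line)
        if match:
            return match.group(1)
    return None
```

Passing an open log file to `first_stall` gives the moment the main thread first failed to find a worker, which can then be compared with whatever the workers logged in the seconds before.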

I need to find out what has changed in ASSP, and what I can do to fix
this problem! Thomas, I did a compare between builds 16018 and 16021
(16018 works fine, 16021 and newer exhibit the above behavior), and see
very little - some SSL changes, griplist upload, not much else.

Any idea where I could start to try to figure out what is going on?


_______________________________________________
Assp-test mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/assp-test

Re: Unable to run versions newer than 16018

GrayHat
:: On Wed, 10 Feb 2016 11:14:45 -0500
:: <[hidden email]>
:: Scott MacLean <[hidden email]> wrote:

> Any idea where I could start to try to figure out what is going on?

I'd try the following:

stop assp

remove the assp\sl-cache folder

run a

ppm update --install

once the update completes run a

ppm log --errors 60

Check for update errors, fix them, and repeat the update. Once that is
done, start ASSP from the command line and leave it running there; that
way, in case of errors or crashes, you'll see the full message(s) on the
console.







Re: Unable to run versions newer than 16018

Colin
In reply to this post by Scott MacLean-4
Hi Scott,

You beat me to it today. I've been trying to figure out why one of our
mailservers keeps crashing. It is the only one running the latest build;
the others are running 16025 without crashing.

I get the following:

2016-02-10 15:21:39 [Main_Thread] Info: Main_Thread got connection request
2016-02-10 15:21:39 [Worker_6] 175.101.68.117 info: no (more) data readable
from 175.101.68.117 (connection closed by peer)
2016-02-10 15:21:39 [Worker_6] Disconnected: session:7FA9A1A18E00
175.101.68.117 - command list was 'n/a' - used 1 SocketCalls - processing
time 0 seconds - damped 0 seconds
2016-02-10 15:21:40 [Worker_9] Info: cleaned 32 kbyte of memory from 2
closed SMTP connections
2016-02-10 15:22:10 [Main_Thread] Info: unable to detect any running worker
for a new connection - wait (max 30 seconds)
*repeated many times per second
2016-02-10 15:24:14 [Main_Thread] Info: ConnectionTransferTimeOut (30
seconds) is now reached
2016-02-10 15:24:14 [Main_Thread] Warning: Main_Thread is unable to
transfer connection to any worker - try again!
2016-02-10 15:24:14 [Main_Thread] Error: Main_Thread is unable to transfer
connection to any worker within 120 seconds - restart ASSP!
2016-02-10 15:24:14 [Main_Thread] Initializing shutdown sequence

I think I've had the monitoring kick in about 5 times and restart assp
today.

All the best,
Colin Waring


Re: Unable to run versions newer than 16018

Thomas Eckardt/eck
In reply to this post by Scott MacLean-4
Scott, what is your setting for 'useDB4IntCache'? I think it is set to
'ON'. Switch it to 'OFF' and start a newer version. A restart is required
if this value is changed!

I think this has something to do with the BDB-tied STATS hash. Either
that hash is corrupt, or ASSP is running into a BDB deadlock (or both).

Tell me whether this change fixes the problem - you'll know after 5
minutes.

Thomas






Re: Unable to run versions newer than 16018

Scott MacLean-4
In reply to this post by GrayHat
On 2/10/2016 11:31 AM, Grayhat wrote:

>
>> Any idea where I could start to try to figure out what is going on?
> I'd try the following:
>
> stop assp
>
> remove the assp\sl-cache folder
>
> run a
>
> ppm update --install
>
> once the update completes run a
>
> ppm log --errors 60
>
> Check for update errors, fix them, and repeat the update. Once that is
> done, start ASSP from the command line and leave it running there; that
> way, in case of errors or crashes, you'll see the full message(s) on the
> console.
Thanks, I had tried this already - all my Perl modules are up to date.
Running ASSP in a console shows nothing - no errors are emitted other
than what is already being written to the ASSP log.

On 2/10/2016 12:00 PM, Thomas Eckardt wrote:

> Scott, what is your setting for 'useDB4IntCache'? I think it is set
> to 'ON'. Switch it to 'OFF' and start a newer version. A restart is
> required if this value is changed!
>
> I think this has something to do with the BDB-tied STATS hash. Either
> that hash is corrupt, or ASSP is running into a BDB deadlock (or both).
>
> Tell me whether this change fixes the problem - you'll know after 5
> minutes.
>
My useDB4IntCache was already disabled - I am not running BerkeleyDB
(ASSP is running against an MSSQL server).

I did quite a bit of work today trying to figure out what is going on.
At first, I noticed that my server was showing as "not healthy" - and I
could see why:


*Spamdb* has version: *2_14315_5.020001_UAX#29_UAX#15_WordStem1.27* -
required version: *2_14315_5.020001_UAX#29_UAX#15_WordStem2.01* ! Run a
rebuildspamdb to correct this!

Obviously now that I was running the newer version of WordStem, it
didn't want to use my old Spamdb.

So I tried to run a rebuildspamdb manually.  It got as far as:

Feb-10-16 20:26:08 File Count:    15,000
Feb-10-16 20:26:08 Processing... notspam with 15,000 files
Feb-10-16 20:26:12 ignore files older than Dec-12-15 20:26:08 in folder
notspam
Feb-10-16 20:27:16 Imported Files for HeloBlackList:    15,000
Feb-10-16 20:27:16 Imported Files for Bayes/HMM:    0
Feb-10-16 20:27:16 Finished in 68 second(s)

Feb-10-16 20:27:16 Generating weighted Bayesian tuplets
Out of memory!

The instant it wrote out the "Out of memory!" message, ASSP started
writing several thousand lines of:

Feb-10-16 20:27:57 [Worker_4] Warning: got unexpected signal SEGV in
Worker_4: package - main, file - sub main::BombWeight_Run, line - 97!

It then terminated with no error message - just exited without writing
any error, either to the log or to the console.

I suspect this may be what has been killing my server so regularly
- I had rebuildspamdb running as a cron job at fairly frequent intervals.
Watching it run, the perl.exe process starts eating memory - beginning
at around the 500MB that ASSP normally uses, and steadily climbing
during the rebuildspamdb process to about 1.5 GB. When it gets to
"Generating weighted Bayesian tuplets" it very rapidly climbs to 1.9 GB,
then ASSP terminates, without completing the RebuildSpamDB process.

This is a fairly heavy duty server, running 64-bit Windows Server 2008,
with 16 GB RAM. However, I have never been able to get ASSP to run
successfully on a 64-bit version of Perl, so it is running on the x86
version of ActiveState Perl.

I think ASSP is just running out of memory when doing rebuildspamdb,
which causes it to terminate. This would also explain why I'm suddenly
getting far more uncaught spam than I used to - my spamdb hasn't been
updated in a while.
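
The figures above fit the usual ceiling for a 32-bit process on Windows. As a rough sketch of the arithmetic, assuming the x86 perl.exe is not large-address-aware and therefore gets the default 2 GiB of user-mode address space (this has not been checked against the actual binary, and the constant and function names are illustrative only):

```python
GIB = 1024 ** 3

# Assumption: a 32-bit Windows process that is not large-address-aware
# gets 2 GiB of user-mode address space, however much RAM is installed.
USER_SPACE_32BIT = 2 * GIB


def headroom_gib(observed_gib, limit_bytes=USER_SPACE_32BIT):
    """Address space left, in GiB, for a process currently at observed_gib."""
    return limit_bytes / GIB - observed_gib


# Figures from the post: about 0.5 in normal operation, 1.5 during the
# rebuild, and 1.9 just before the process terminates.
for usage in (0.5, 1.5, 1.9):
    print(f"{usage:.1f} GiB used -> {headroom_gib(usage):.1f} GiB left")
```

Under that assumption the 16 GB of installed RAM makes no difference; the process runs out of address space long before the machine runs out of memory.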

I think my solution is to reduce my spam and notspam folders down from
15,000 files, to make the rebuild more manageable and reduce memory
usage. Unless you have a better idea?
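
Trimming a corpus folder to its newest files could be sketched as follows. This is only an illustration: the function names and the idea of selecting by modification time are assumptions, and it ignores any corpus-size settings ASSP itself may provide. It defaults to a dry run so nothing is deleted by accident.

```python
from pathlib import Path

def files_to_prune(entries, keep):
    """Return the items to remove so that only the `keep` newest remain.

    `entries` is an iterable of (item, mtime) pairs; the result lists
    the oldest items first.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    ordered = sorted(entries, key=lambda entry: entry[1])  # oldest first
    excess = max(len(ordered) - keep, 0)
    return [item for item, _ in ordered[:excess]]

def prune_folder(folder, keep, dry_run=True):
    """Remove (or, in a dry run, only list) the oldest files in `folder`."""
    entries = [(p, p.stat().st_mtime) for p in Path(folder).iterdir() if p.is_file()]
    doomed = files_to_prune(entries, keep)
    for path in doomed:
        if dry_run:
            print("would remove", path)
        else:
            path.unlink()
    return doomed
```

For example, `prune_folder("notspam", keep=5000)` would list the files beyond the newest 5,000 without touching them; the folder name and the count are placeholders.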


Re: Unable to run versions newer than 16018

Scott MacLean-4
Along these lines, is it possible to run rebuildspamdb OUTSIDE of the
ASSP process? Perhaps be able to call ASSP with a command-line parameter
that tells it not to spool up the message processing engine, but to just
run rebuildspamdb, and then exit. That way it could be run in a separate
process with its own memory space.

Alternatively, has anyone gotten ASSP to run successfully under a 64 bit
version of Perl on Windows? If I could just run it under an X64 Perl,
and have it use the approximately 9 free GB of RAM sitting unused on my
server, that would be ideal.


Re: Unable to run versions newer than 16018

Thomas Eckardt/eck
> is it possible to run rebuildspamdb OUTSIDE of the
> ASSP process?

No, because in V2 the spamdb rebuild is a never-ending process.
Reported mails are analyzed immediately and the corpus is corrected
accordingly, just in time.
And what would be the advantage? After the first run, the rebuild thread
permanently takes around 100 MB of RAM, plus several hundred MB more while
an analysis is running.

> That way it could be run in a separate
> process with its own memory space.

"Own memory" is exactly the reason why the rebuild runs in a
separate thread and not in a separate process - it has to share memory
with the other threads.
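
The point about shared memory can be illustrated with a small sketch. It is written in Python rather than Perl and does not reflect ASSP's actual code: the `corpus` dictionary, the worker function and the sample words are all hypothetical. It only shows the general idea that a thread can update data which the rest of the program sees directly, with no copying between processes.

```python
import threading

# Hypothetical shared state: word -> number of times it was reported.
corpus = {}
corpus_lock = threading.Lock()

def rebuild_worker(reported_words):
    """Update the shared corpus in place from a worker thread."""
    for word in reported_words:
        with corpus_lock:  # guard concurrent updates
            corpus[word] = corpus.get(word, 0) + 1

worker = threading.Thread(target=rebuild_worker,
                          args=(["invoice", "lottery", "invoice"],))
worker.start()
worker.join()

# The main thread sees the worker's updates without any transfer step.
print(corpus)
```

A separate process would have its own copy of such data, so results would have to come back through a file, a database or some other channel. Note also that Perl threads only share variables that are explicitly marked as shared, unlike the Python threads used here.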

> Alternatively, has anyone gotten ASSP to run successfully under a 64 bit
> version of Perl on Windows?

Yes, but the installation takes a long time, because all XS-code modules
have to be compiled from source. Strawberry Perl is well suited for
this.
You won't have any luck using ActivePerl x64.
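
Before and after switching Perl distributions, it may help to confirm which build is actually on the PATH. A minimal sketch, assuming `perl` is reachable from the command line; the helper names are hypothetical, while `perl -V:ptrsize` is a standard way to print the pointer size a Perl was built with:

```python
import re
import subprocess

def parse_ptrsize(output):
    """Turn the output of `perl -V:ptrsize` (e.g. "ptrsize='8';") into a bit width."""
    match = re.search(r"ptrsize='(\d+)'", output)
    if match is None:
        raise ValueError(f"unexpected output: {output!r}")
    return int(match.group(1)) * 8

def perl_pointer_bits(perl="perl"):
    """Ask the given perl executable for its pointer size, in bits."""
    result = subprocess.run([perl, "-V:ptrsize"],
                            capture_output=True, text=True, check=True)
    return parse_ptrsize(result.stdout)
```

Calling `perl_pointer_bits()` should return 32 for an x86 build and 64 for an x64 build.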

Thomas

Re: Unable to run versions newer than 16018

Colin
In reply to this post by Scott MacLean-4
Hi Scott,

I too have my Perl modules up to date, although I am running on Ubuntu LTS
14.04 64 bit.

I only have the rebuild running once per day, and only from one of my
ASSP instances, as they have a shared corpus.

The instance that was locking up was not running the rebuild at all and I'm
not seeing any out of memory errors.

I wonder if that is either unrelated or a different symptom of the same
problem.

Thinking on, I also saw two other issues around the same time. The first
was that I had to set the remote monitoring server to not use STARTTLS,
otherwise I kept getting false down reports.

Last night when downgrading I noticed a lot of SSL renegotiations and "SSL
want read first" on larger messages, but I haven't had time to investigate
those further.

All the best,
Colin.