I tried heat treating the garden knife tonight by lantern light

I over-tempered the long blade but got the chisel point how I wanted it. I could tell the difference while sharpening it.

I may try heat treating the blade again soon, during the day. Things happen so fast. At least this is something I can keep trying until I get it right.


Tuning the training rate of a NeuralMesh ANN

I’ve been running a NeuralMesh artificial neural network (ANN) for about a week now. Its purpose is spam detection. I am feeding it 7 different spam scores from the b8 Bayesian library.

My 7 ‘aspects’ of email spam are:

  • to
  • from
  • full from
  • subject
  • full subject
  • header
  • body

I am using the To field because spam is sometimes sent to funny names, and real email is sometimes sent to a particular person. The From aspect scores both the email address and the name the message comes from; luckily some spammers put their product names in the from. The Full From processes the from field as a whole, which should be good for duplicate messages: when I’ve seen it once before, I have a good idea what it is. The subject and full subject operate on the same principle; however, sometimes a random word is put into a subject, so the full subject isn’t necessarily a solid repeat indicator. The header is designed to look for SPF records and IP addresses in addition to including the above bits. The body, well, that’s just the whole message.
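As a rough sketch of how the seven scores could be gathered into one input vector: the snippet below assumes a hypothetical `$classify` callable that wraps whatever Bayesian classifier is in use and returns a spam probability between 0 and 1. The helper name `buildAspectScores()`, the array keys, and the stand-in classifier are all made up for illustration; none of this is the real b8 or NeuralMesh API, and the per-aspect tokenizing described above is left to the classifier.

```php
<?php
// Hypothetical helper: score each of the seven aspects separately.
// $classify is an assumed callable (string => float in [0, 1]); it stands in
// for the Bayesian classifier and is NOT the real b8 API.
function buildAspectScores(array $email, callable $classify): array
{
    $aspects = ['to', 'from', 'full from', 'subject', 'full subject', 'header', 'body'];
    $scores = [];
    foreach ($aspects as $aspect) {
        // Missing fields are scored as empty text rather than skipped,
        // so the result always has exactly seven entries.
        $scores[$aspect] = (float) $classify($email[$aspect] ?? '');
    }
    return $scores;
}

// Stand-in classifier for the demo: flags one made-up keyword.
$fakeClassifier = function (string $text): float {
    return stripos($text, 'pills') !== false ? 0.99 : 0.10;
};

$email = [
    'to'      => 'bob@example.com',
    'from'    => 'Cheap Pills <sales@example.com>',
    'subject' => 'Lunch on Friday?',
    'body'    => 'See you at noon.',
];

// Plain list of seven floats, in a fixed order, ready to hand to a network.
$inputs = array_values(buildAspectScores($email, $fakeClassifier));
```

Keeping the order of the aspects fixed matters more than the names: the network only sees positions, so input 2 always has to mean the same aspect.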

I feed all these different scores into a 7-input, 5-neuron, 1-output ANN to let it sort out what’s meaningful and what’s not. Every time somebody deletes a message as spam or accepts an email as not spam, I use that as a training trigger for both the b8 Bayesian library and the NeuralMesh ANN library. Because my spam to not-spam ratio is so out of whack, the training loop doesn’t need many iterations for spam, but needs a lot for not spam.
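To make the 7-5-1 shape concrete, here is a minimal forward pass written from scratch. It is only a sketch: the sigmoid activation, the array layout of `$net`, and the function names are my assumptions, not how NeuralMesh stores or evaluates its networks. The weights are all zero purely to show the shapes.

```php
<?php
// Logistic activation, squashing any real number into (0, 1).
// Assumption: the post does not say which activation NeuralMesh uses.
function sigmoid(float $x): float
{
    return 1.0 / (1.0 + exp(-$x));
}

// One layer: each row of $weights holds one neuron's input weights,
// and $biases holds one bias per neuron.
function layerForward(array $inputs, array $weights, array $biases): array
{
    $outputs = [];
    foreach ($weights as $n => $neuronWeights) {
        $sum = $biases[$n];
        foreach ($neuronWeights as $i => $w) {
            $sum += $w * $inputs[$i];
        }
        $outputs[] = sigmoid($sum);
    }
    return $outputs;
}

// 7 inputs -> 5 hidden neurons -> 1 output.
function forward(array $inputs, array $net): float
{
    $hidden = layerForward($inputs, $net['hiddenW'], $net['hiddenB']);
    return layerForward($hidden, $net['outW'], $net['outB'])[0];
}

// Placeholder weights (all zero); a trained network would hold learned values.
$net = [
    'hiddenW' => array_fill(0, 5, array_fill(0, 7, 0.0)),
    'hiddenB' => array_fill(0, 5, 0.0),
    'outW'    => [array_fill(0, 5, 0.0)],
    'outB'    => [0.0],
];

$score = forward([0.9, 0.8, 0.99, 0.7, 0.6, 0.95, 0.85], $net);
```

With every weight at zero the output sits exactly halfway, at 0.5, whatever the inputs are. Training is what moves it toward 1 for spam or 0 for not spam.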

To address this, I am using an adaptive training iteration system. The closer the system’s score is to pure spam, the fewer training iterations I run; the further from pure spam, the more iterations I do.

$learning_iterations = round((1 - $finalscore['AI Score']) * 100);
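Here is the same formula wrapped in a function so it can be tried with a few scores. The function name and the clamping are my additions, and I am assuming the score runs from 0 (not spam) to 1 (pure spam).

```php
<?php
// Map a spam score in [0, 1] to a number of training iterations:
// near 1 (confident spam) gives few iterations, near 0 gives many.
function learningIterations(float $aiScore): int
{
    // Defensive clamp (an addition); the one-liner above assumes the
    // score is already in range.
    $aiScore = max(0.0, min(1.0, $aiScore));
    return (int) round((1 - $aiScore) * 100);
}

// A few sample scores, from "almost certainly spam" to "probably not spam".
foreach ([0.97, 0.50, 0.16] as $score) {
    echo $score, ' => ', learningIterations($score), " iterations\n";
}
```

A score of 0.97 works out to 3 iterations and a score of 0.16 to 84, so the count can never go above 100 or below 0.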

This seems to be working well, judging by the table below of recent training sessions the ANN has undergone. The fewer the iterations, the higher the spam score, and spam is what the system has seen a lot of. Comparing the Start MSE to the End MSE, any training under 4 iterations doesn’t really change anything. The sessions with higher iteration counts are not spam, so the system starts with a very high Start MSE, and the extra training iterations bring the End MSE right down.
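For anyone unsure what the MSE columns measure, this is a small sketch of mean squared error for a single training sample. I am assuming the usual encoding of 1.0 for spam and 0.0 for not spam as the target; the function name and the sample outputs are made up for illustration, not taken from NeuralMesh.

```php
<?php
// Mean squared error between target values and network outputs.
function meanSquaredError(array $targets, array $outputs): float
{
    $sum = 0.0;
    foreach ($targets as $i => $target) {
        $diff = $target - $outputs[$i];
        $sum += $diff * $diff;
    }
    return $sum / count($targets);
}

// Assumed encoding: 1.0 = spam, 0.0 = not spam.
// A real email that the network wrongly scores as near-certain spam:
$hamScoredAsSpam = meanSquaredError([0.0], [0.989]);

// A spam email that the network already scores as spam:
$spamScoredAsSpam = meanSquaredError([1.0], [0.99]);
```

Under that assumption, a not-spam message that the network is confident is spam starts with an error close to 1, while spam it already recognises starts with an error of around 0.0001, which leaves a few iterations very little to correct.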

You can see that after every non-spam training session, it takes a couple of data sets before it settles back down to its usual boring 3s and 4s for training. The data corpus is still pretty small, with only a few non-spam data points trained into the system so far.

I think I am going to skip all training under 3 iterations, as it doesn’t seem to make any real difference. It probably just loads the system up with extra data points, and will eventually slow it down as I accumulate more data. I lose a lot of the little finesse trainings, but I think having fewer, more meaningful trainings as the corpus drifts will be more resource friendly.
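One way the skip could look, as a sketch: compute the iteration count as before, then treat anything below the cutoff as "do not train". The constant and function names are hypothetical, and the cutoff of 3 is simply the figure from the paragraph above.

```php
<?php
// Assumed cutoff, taken from the "under 3 iterations" idea in the text.
const MIN_USEFUL_ITERATIONS = 3;

// Returns how many iterations to train for, or 0 to skip this message.
function plannedIterations(float $aiScore): int
{
    $iterations = (int) round((1 - $aiScore) * 100);
    return $iterations < MIN_USEFUL_ITERATIONS ? 0 : $iterations;
}

// Example: decide whether a message is worth a training session.
$iterations = plannedIterations(0.98);
if ($iterations === 0) {
    echo "skip training\n";
} else {
    echo "train for $iterations iterations\n";
}
```

Keeping the cutoff in one named constant makes it easy to bump to 4 or 5 later if those sessions turn out to be just as pointless.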

Iterations Start MSE End MSE Date Exec Time Off-line
66 0.978118000 0.000000000 16/12/2010 9:33:15 am 0.01188200 n
3 0.000087000 0.000087000 16/12/2010 8:45:08 am 0.00063600 n
3 0.000108000 0.000108000 16/12/2010 8:44:43 am 0.00059300 n
3 0.000098000 0.000098000 16/12/2010 8:44:41 am 0.00056200 n
3 0.000083000 0.000083000 16/12/2010 8:44:40 am 0.00061200 n
3 0.000107000 0.000107000 16/12/2010 8:44:39 am 0.00062100 n
4 0.000100000 0.000100000 16/12/2010 8:44:37 am 0.00078800 n
3 0.000078000 0.000078000 16/12/2010 8:44:36 am 0.00061500 n
3 0.000077000 0.000077000 16/12/2010 8:44:35 am 0.00069400 n
4 0.000113000 0.000113000 16/12/2010 8:44:33 am 0.00076700 n
3 0.000114000 0.000114000 16/12/2010 8:44:31 am 0.00060800 n
3 0.000069000 0.000068000 16/12/2010 8:44:30 am 0.00061600 n
4 0.000105000 0.000104000 16/12/2010 8:44:09 am 0.00075900 n
4 0.000101000 0.000101000 16/12/2010 8:44:08 am 0.00075800 n
4 0.000118000 0.000118000 16/12/2010 8:44:06 am 0.00075200 n
3 0.000119000 0.000119000 16/12/2010 8:44:04 am 0.00062000 n
4 0.000066000 0.000065000 16/12/2010 8:44:04 am 0.00086800 n
3 0.000119000 0.000119000 16/12/2010 8:44:01 am 0.00061100 n
3 0.000120000 0.000120000 16/12/2010 8:44:01 am 0.00063000 n
4 0.000123000 0.000123000 16/12/2010 8:43:58 am 0.00086400 n
4 0.000112000 0.000112000 16/12/2010 8:43:30 am 0.00080700 n
3 0.000113000 0.000113000 16/12/2010 8:43:29 am 0.00058800 n
4 0.000114000 0.000114000 16/12/2010 8:43:28 am 0.00076300 n
3 0.000128000 0.000128000 16/12/2010 8:43:27 am 0.00060900 n
4 0.000129000 0.000129000 16/12/2010 8:43:26 am 0.00088600 n
3 0.000105000 0.000105000 16/12/2010 8:43:24 am 0.00058300 n
4 0.000131000 0.000130000 16/12/2010 8:43:23 am 0.00076200 n
4 0.000082000 0.000082000 16/12/2010 8:43:21 am 0.00075600 n
3 0.000134000 0.000134000 16/12/2010 8:43:20 am 0.00061800 n
4 0.000122000 0.000122000 16/12/2010 8:43:04 am 0.00075100 n
4 0.000137000 0.000136000 16/12/2010 8:43:02 am 0.00079800 n
4 0.000139000 0.000138000 16/12/2010 8:43:02 am 0.00086800 n
4 0.000140000 0.000139000 16/12/2010 8:43:00 am 0.00073600 n
4 0.000142000 0.000141000 16/12/2010 8:42:59 am 0.00075300 n
4 0.000128000 0.000128000 16/12/2010 8:42:59 am 0.00076200 n
4 0.000099000 0.000098000 16/12/2010 8:42:57 am 0.00075400 n
4 0.000114000 0.000113000 16/12/2010 8:42:17 am 0.00084600 n
4 0.000147000 0.000146000 16/12/2010 8:39:23 am 0.00076900 n
4 0.000149000 0.000148000 16/12/2010 8:39:20 am 0.00079300 n
4 0.000136000 0.000135000 16/12/2010 8:39:18 am 0.00081300 n
4 0.000138000 0.000137000 16/12/2010 8:39:15 am 0.00075000 n
4 0.000158000 0.000157000 16/12/2010 8:39:13 am 0.00079700 n
4 0.000155000 0.000154000 16/12/2010 8:39:13 am 0.00074500 n
8 0.000291000 0.000160000 16/12/2010 8:39:12 am 0.00150400 n
19 0.000667000 0.000524000 16/12/2010 8:37:45 am 0.00386300 n
21 0.002455000 0.000771000 16/12/2010 8:37:43 am 0.00371700 n
84 0.075070000 0.000000000 16/12/2010 8:35:36 am 0.01497400 n
5 0.000146000 0.000144000 16/12/2010 8:34:20 am 0.00097200 n
5 0.000148000 0.000147000 16/12/2010 8:34:18 am 0.00094400 n
5 0.000151000 0.000149000 16/12/2010 8:34:17 am 0.00095600 n
5 0.000105000 0.000104000 16/12/2010 8:34:14 am 0.00096000 n
9 0.000123000 0.000121000 16/12/2010 8:34:03 am 0.00164900 n
9 0.000126000 0.000124000 16/12/2010 8:30:34 am 0.00171600 n
6 0.000129000 0.000127000 16/12/2010 8:30:13 am 0.00114000 n
5 0.000159000 0.000157000 16/12/2010 8:30:12 am 0.00092800 n
6 0.000169000 0.000167000 16/12/2010 8:30:11 am 0.00115800 n
6 0.000172000 0.000169000 16/12/2010 8:29:32 am 0.00231000 n
6 0.000150000 0.000148000 16/12/2010 8:29:29 am 0.00116100 n
6 0.000192000 0.000189000 16/12/2010 8:29:27 am 0.00110500 n
6 0.000105000 0.000104000 16/12/2010 8:29:17 am 0.00114200 n
8 0.000183000 0.000178000 16/12/2010 8:29:02 am 0.00164100 n
6 0.000195000 0.000191000 16/12/2010 8:28:14 am 0.00116900 n
5 0.000218000 0.000215000 16/12/2010 8:26:38 am 0.00098300 n
6 0.000200000 0.000197000 16/12/2010 8:26:13 am 0.00114300 n
10 0.000167000 0.000161000 16/12/2010 8:25:59 am 0.00187000 n
6 0.000229000 0.000216000 16/12/2010 8:22:49 am 0.00110500 n
2 0.004528000 0.004528000 16/12/2010 8:22:44 am 0.00034600 n
7 0.000176000 0.000172000 16/12/2010 8:22:41 am 0.00128300 n
8 0.000269000 0.000259000 16/12/2010 8:19:45 am 0.00146800 n
11 0.000334000 0.000311000 15/12/2010 5:19:48 pm 0.00199000 n
10 0.000340000 0.000319000 15/12/2010 4:10:16 pm 0.00186500 n
12 0.000869000 0.000527000 15/12/2010 2:47:10 pm 0.00221200 n
8 0.000471000 0.000441000 15/12/2010 2:46:57 pm 0.00165100 n
12 0.000552000 0.000488000 15/12/2010 1:14:04 pm 0.00221000 n
68 0.008667000 0.000560000 15/12/2010 1:12:59 pm 0.01240600 n
30 0.976014000 0.000002000 15/12/2010 1:12:19 pm 0.00550100 n
4 0.000014000 0.000014000 15/12/2010 1:11:02 pm 0.00086600 n
4 0.000371000 0.000366000 15/12/2010 11:13:43 am 0.00089800 n
4 0.000013000 0.000013000 15/12/2010 9:32:22 am 0.00078800 n
4 0.000022000 0.000022000 15/12/2010 9:32:10 am 0.00073700 n
4 0.000247000 0.000243000 15/12/2010 9:32:09 am 0.00079800 n
3 0.000052000 0.000052000 15/12/2010 9:32:08 am 0.00057900 n
4 0.000034000 0.000034000 15/12/2010 9:32:07 am 0.00079400 n
3 0.000154000 0.000153000 15/12/2010 9:32:06 am 0.00068800 n
4 0.000019000 0.000019000 15/12/2010 9:32:04 am 0.00081700 n
4 0.000017000 0.000017000 15/12/2010 9:32:03 am 0.00090600 n
4 0.000014000 0.000014000 15/12/2010 9:32:01 am 0.00079600 n
4 0.000045000 0.000045000 15/12/2010 9:31:59 am 0.00079000 n
4 0.000032000 0.000032000 15/12/2010 9:31:58 am 0.00074200 n
4 0.000019000 0.000019000 15/12/2010 9:31:57 am 0.00076000 n
4 0.000262000 0.000258000 15/12/2010 9:31:33 am 0.00082000 n
4 0.000094000 0.000093000 15/12/2010 9:30:55 am 0.00073200 n
3 0.000051000 0.000051000 15/12/2010 9:30:54 am 0.00061300 n
4 0.000014000 0.000014000 15/12/2010 9:30:53 am 0.00078600 n
4 0.000014000 0.000014000 15/12/2010 9:30:51 am 0.00077100 n
4 0.000282000 0.000278000 15/12/2010 9:30:49 am 0.00081600 n
4 0.000014000 0.000014000 15/12/2010 9:30:48 am 0.00074700 n
4 0.000014000 0.000014000 15/12/2010 9:30:47 am 0.00074000 n
4 0.000014000 0.000014000 15/12/2010 9:30:46 am 0.00080700 n
4 0.000020000 0.000020000 15/12/2010 9:30:45 am 0.00084900 n

I am probably going about this all wrong, but so far it seems to be working. Let me know if there is a better way.