APPLYING ARTIFICIAL NEURAL NETWORKS FOR

                         CHARACTER RECOGNITION

 

Team Members:     Santosh Ganti

                                Shrirang Deshpande

                                Heemanshu Midde

                                Prachi Varma

College name: A. C. Patil College of Engineering


CONTENTS:


1. Abstract
2. Introduction
3. Body
3.1 Handling NEURO.java
3.2 Structure of NEURO.java
3.3 Logic behind NEURO.java
4. Conclusion
5. Reference

ABSTRACT:

We have developed an application for the recognition of the digits 0–9. This paper describes how an artificial neural network has been implemented for this application.

An artificial neural network is a system loosely modeled on the human brain. The system architecture consists of many interconnected neurons. Each neuron is linked to some of its neighbors with varying coefficients of connectivity that represent the strengths of these connections. Learning is accomplished by adjusting these strengths so that the overall network outputs appropriate results. Most applications of neural networks fall into the following three categories:

Prediction:
Uses input values to predict some output, e.g., picking the best stocks in the market or predicting the weather.

Classification:
Uses input values to determine a classification, e.g., is the input the letter A?

Data filtering:
Smooths an input signal, e.g., removing the noise from a telephone signal.

We have developed an artificial neural network called NEURO.java, which is capable of classifying handwritten digits. It involves a training phase, where the network learns the handwriting, and a generalization phase, where the network recognizes further inputs. This paper provides a detailed description of the structure of this neural network, along with a description of the logic used for character recognition.

The main idea of the paper is to demonstrate the simplicity with which neural networks can be used for basic character recognition. It does not include any analysis or comparative interpretation of other work.


INTRODUCTION:

We humans have the ability of optical character recognition. In other words, we can differentiate between different characters and recognize them as an A, or a B, and so on. Can we embed such an ability in software, and if we can, how?

If we try to understand what exactly happens when we read, we realize that when we see the printed page, an image forms on the retina of the eye, signals are sent to the brain, and the brain cells, called neurons, possess the intelligence that lets them recognize the characters.

Now, if we simulate this behavior in software, what we would actually be doing is creating artificial intelligence. This field of artificial intelligence, which simulates the behavior of a biological neural network in order to perform intelligent tasks, is known as artificial neural networks. A typical artificial neural network is shown in fig.1.

Fig.1   A typical neural network

The nodes correspond to the neurons of the brain. They can be implemented in software as program modules that interact with each other by parameter passing. The input layer nodes are responsible for providing input to the network, the hidden layer nodes perform the required processing, and the output node gives the final output. There are neural networks with more complex structures, varying numbers of nodes, different types of interconnections, etc., but this paper is restricted to discussing only NEURO.java.


BODY:

1. Handling NEURO.java:

NEURO.java has been developed as a Java applet. It is a 1600:10:1 neural network, i.e., it has 1600 input nodes, 10 hidden nodes, and 1 output node.

First, a predefined grid as shown in fig.2-1 is printed on paper; the user then writes the digits in his or her own handwriting in the manner shown in fig.2-2.

                        fig.2-1                                  fig.2-2

 

The complete paper is then scanned and stored as a bitmap image. Java provides a class called PixelGrabber, which returns the pixels of any specified rectangular region within an image. Using this class and the predefined grid, we access the pixels representing the individual digits written in the specified squares. These pixels are then given to the network as input. The first 10 rows are used for training the network and hence must contain the handwritten digits in the format of fig.2-2; these comprise the training-phase inputs. The last row contains a sequence of random digits which the network has to recognize; these comprise the generalization-phase inputs.
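The pixel-access step can be sketched as follows. PixelGrabber is a real class in java.awt.image, but the helper and class names below are our own illustration, not code from NEURO.java:

```java
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.awt.image.PixelGrabber;

public class GrabDemo {
    // Returns the pixels of the 'side' x 'side' square whose top-left
    // corner is at (x, y), as a one-dimensional array of ARGB values.
    public static int[] grabSquare(Image img, int x, int y, int side)
            throws InterruptedException {
        int[] pixels = new int[side * side];
        PixelGrabber grabber =
                new PixelGrabber(img, x, y, side, side, pixels, 0, side);
        grabber.grabPixels();              // blocks until the pixels arrive
        return pixels;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the scanned page: a small blank in-memory image.
        BufferedImage page =
                new BufferedImage(80, 80, BufferedImage.TYPE_INT_RGB);
        int[] cell = grabSquare(page, 0, 0, 40);
        System.out.println(cell.length);   // 1600 pixels per digit cell
    }
}
```

In the actual applet, the image would come from the scanned bitmap rather than a blank BufferedImage, and (x, y) would step through the squares of the predefined grid.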

Once the applet has started, it reads this scanned image and trains itself to recognize the specified handwriting. The last row is then used for testing the network: on the basis of its previous training, the network tries to recognize this random sequence of digits and displays it in the applet window.

 

2. The structure of NEURO.java:

fig.3 NEURO.java

As explained before, the input nodes are responsible for providing input to the network. Now, how do we provide the scanned image as input to NEURO.java?

We know that we have access to the pixels representing the individual digits in the squares of the predefined grid. These squares are restricted to a size of 40 pixels on a side. Using PixelGrabber, we obtain this square of pixels as a one-dimensional array of size 40*40. Using the brightness value of these pixels, every black pixel is assigned an integer value of +1, and every white pixel an integer value of –1. This array of integers is provided to the input layer, which has 40*40 = 1600 input nodes.
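The black/white encoding described above might look like this in Java. The brightness threshold of 128 and all names are our assumptions, not details from the original source:

```java
public class Encode {
    // Maps each ARGB pixel to +1 (dark, ink) or -1 (light, paper),
    // using the average of the red, green, and blue components.
    public static int[] toInputs(int[] pixels) {
        int[] inp = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            int r = (pixels[i] >> 16) & 0xFF;
            int g = (pixels[i] >> 8) & 0xFF;
            int b = pixels[i] & 0xFF;
            int brightness = (r + g + b) / 3;
            inp[i] = (brightness < 128) ? +1 : -1;  // black -> +1, white -> -1
        }
        return inp;
    }

    public static void main(String[] args) {
        int[] px = { 0xFF000000, 0xFFFFFFFF };      // one black, one white pixel
        int[] in = toInputs(px);
        System.out.println(in[0] + " " + in[1]);    // prints "1 -1"
    }
}
```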

The hidden layer nodes are responsible for processing the input and providing their output to the output node. There are 10 hidden nodes, and each one of them is trained to recognize a particular digit: node 0 recognizes digit 0, node 1 recognizes digit 1, and so on. Thus, during the training phase, the first row of all zeroes in fig.2-2 is used to train node 0, the next row of all ones to train node 1, and so on.

We can see from fig.3 that every input node is connected to every hidden node. Every such connection has a connection weight, which represents the strength of the connection. It is this collection of weights that represents the intelligence of the network, and changing these weights to generate the desired output is termed training the network.

The output layer has a single node, which receives the outputs of all the hidden nodes. It is this node that makes the final decision of recognizing the digit, which it then displays in the window of the Java applet.

 

3. The logic behind NEURO.java:

For simplicity of understanding, we reduce the digit size on the image to a square of side 6 pixels, and the hidden layer to only node 0. This simplified network is shown in fig.4.

fig. 4

Since the image size is now 6*6 = 36 pixels, we will have 36 input nodes and, correspondingly, 36 connection weights. These weights can take any real value between +1 and –1, as can the output of node 0. The logic of the network is designed so that the closer the output of node 0 is to +1, the higher the probability that the input image corresponds to digit 0. Initially, all the weights have a uniform arbitrary value, say 0.1, as shown in fig.5.

fig. 5

The values at the input nodes, i.e., +1 for a black pixel and –1 for a white pixel, are multiplied by the corresponding weights and submitted to node 0, which outputs the sum of all these products. As stated before, the closer this output is to +1, the higher the probability that the input image corresponds to digit 0. In the training phase, since we are providing perfect images of 0, the desired output is +1. But since the network is not yet trained, there will be an error in the output. This error is fed back to the network, and the weights are changed so as to reduce it. Thus, the network learns. The change in weights is governed by the following rule:

wts[i] += L * err * inp[i]

where,

wts[i] - one of the 36 weights

L - the learning rate

err - the error in the output

inp[i] - the corresponding input (+1 for a black pixel, –1 for a white pixel)

If we observe the formula carefully, we see that the weight corresponding to a black pixel increases and the weight corresponding to a white pixel decreases. The amount of increase or decrease is determined by the learning rate L, which in this case is 0.2. At the end of the training phase, the weights for node 0 will be as shown in fig.6.
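The weighted sum and the update rule above can be sketched as follows for the simplified 6*6 network. The learning rate of 0.2, the array size of 36, and the update rule come from the text; the names, the sample input pattern, and the loop structure are our own illustration:

```java
public class TrainStep {
    static final double L = 0.2;              // learning rate, as in the text

    // Output of hidden node 0: the sum of input * weight products.
    static double output(int[] inp, double[] wts) {
        double sum = 0.0;
        for (int i = 0; i < inp.length; i++) sum += inp[i] * wts[i];
        return sum;
    }

    // One training step: compute the error against the desired output
    // (+1 for a perfect zero) and apply wts[i] += L * err * inp[i].
    static void train(int[] inp, double[] wts, double desired) {
        double err = desired - output(inp, wts);
        for (int i = 0; i < inp.length; i++) {
            wts[i] += L * err * inp[i];
        }
    }

    public static void main(String[] args) {
        // Toy 6x6 image: left half of each row black (+1), right half white (-1).
        int[] inp = new int[36];
        for (int i = 0; i < 36; i++) inp[i] = (i % 6 < 3) ? +1 : -1;
        double[] wts = new double[36];
        java.util.Arrays.fill(wts, 0.1);      // uniform initial weights (fig.5)
        train(inp, wts, +1.0);
        // Weights at black pixels rose above 0.1; at white pixels they fell.
        System.out.println(wts[0] > 0.1 && wts[5] < 0.1);   // prints "true"
    }
}
```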

fig. 6

Notice that the weights corresponding to the black pixels have increased, and those corresponding to the white pixels have decreased. In a similar manner, the weights of the remaining nine nodes are adjusted according to the remaining nine digits. The network is now ready for the generalization phase. In this phase, the input image is given to all the hidden nodes, and each node generates its own output. Let us reconsider node 0. Every black pixel in the elevated area results in a positive product, and every black pixel outside the elevated area results in a negative product. Now, if the input image really is a zero, there will be more black pixels in the elevated area, so the output of node 0 will be greater than that of any other hidden node, suggesting that the input image corresponds to digit 0. The final decision is made by the output node, which receives the outputs from all the hidden nodes and decides which one has the maximum output, thus recognizing the input character.
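The decision described above (each hidden node computes its weighted sum, and the output node picks the largest) can be sketched as follows. The class and method names, and the toy weights, are our own illustration, not taken from NEURO.java:

```java
public class Decide {
    // wts[d] holds the trained weights of hidden node d (digit d).
    // Returns the digit whose hidden node produced the largest output.
    static int recognize(int[] inp, double[][] wts) {
        int best = 0;
        double bestOut = Double.NEGATIVE_INFINITY;
        for (int d = 0; d < wts.length; d++) {
            double out = 0.0;
            for (int i = 0; i < inp.length; i++) out += inp[i] * wts[d][i];
            if (out > bestOut) { bestOut = out; best = d; }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy setup: 3 "digits", 4 inputs; node 2's weights line up with
        // the input pattern (+ where the input is +1, - where it is -1).
        int[] inp = { +1, -1, +1, -1 };
        double[][] wts = {
            {  0.1,  0.1,  0.1,  0.1 },
            { -0.2,  0.2, -0.2,  0.2 },
            {  0.3, -0.3,  0.3, -0.3 },
        };
        System.out.println(recognize(inp, wts));   // prints "2"
    }
}
```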


CONCLUSION:

This paper presents a clear idea of how easily one can implement artificial neural networks for character recognition. NEURO.java can be extended to recognize all ASCII characters, or the characters of other languages such as Chinese. It can also be adapted for signature matching for authorization at banks, for image processing, and for more complex optical character recognition.


REFERENCE:

" Neural Networks-A comprehensive foundation"

- Simon Haykins ( Pearson Education Inc, 1999)

 

 

