Creating A Credit Card OCR Application Part 1

Credit Card OCR


Throughout the computer vision community, credit card optical character recognition (OCR) has become a popular starter project for those who want to get their feet wet in computer vision and OCR. Although we won't be creating corporate-grade OCR software today, we will create an introductory program that, as time goes on, we can modify and expand to not only include more features, but also teach us more computer vision and OCR techniques.

To begin this algorithm, we are going to use Google's Tesseract for our OCR. Originally developed by HP Labs decades ago, and with support for over a hundred languages, Tesseract has become the staple open-source engine for any project using OCR. To tweak and enhance our detection results, we will be using EmguCV, which has Tesseract built in, which is perfect for our needs. If you want to look at the full code for this project, you can view the project's GitHub repo.

The image that we are using to test our detection can be found below. Although it is pretty simple, it is perfect for our beginner program, and we will improve our detection methods with harder images later on.

Before we can actually detect text on an image, we first must set up our program with the appropriate features to support OCR detection. To enable Tesseract detection in our program, we can use the following code below.

//Declare a new Tesseract OCR engine
private static Tesseract _ocr;

Here we declare a Tesseract object as a class level variable. We do this so we can modify and use the Tesseract object throughout multiple methods in our algorithm. Afterwards, using the code below, we configure certain settings for our Tesseract object to help generate the results that we want.

public static void SetTesseractObjects(string dataPath)
{
   //create OCR engine
   _ocr = new Tesseract(dataPath, "eng", OcrEngineMode.TesseractLstmCombined);
   _ocr.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ-1234567890/");
}

Using the dataPath variable, we specify where our training data file for the English language is located, which you can download from this GitHub repo, and we specify which model was used to create it. Next, we limit which characters we want to detect, using SetVariable and the whitelist feature. It's important to note, however, that the last time I checked, the whitelist feature for Tesseract didn't work, but I included it in the project so that when the issue does get patched, we will be able to use it.

The next method that we create is used for loading in the image. Doing this is pretty easy, and we can use the code below to load our image in the BGR color space.

public static Mat ReadInImage()
{
   //Change this file path to the path where your input image is located
   string filePath = Directory.GetParent(Directory.GetParent
      (Environment.CurrentDirectory).ToString()) + @"/Images/creditCard.png";

   //Read in the image from the filepath
   Mat img = CvInvoke.Imread(filePath, ImreadModes.AnyColor);

   //Return the image
   return img;
}

Once we have set up our detection and an image, we are ready to begin our credit card OCR. If we were to detect text on the raw input image, we would get multiple irrelevant characters and random symbols that we wouldn't want. On top of that, for this project we currently only want to detect the credit card number, and in order to do that, we must pass our image through some filters to distinguish the card number from everything else. Using the code below, we can help Tesseract achieve better detection results.

        private static List<Mat> ImageProccessing(Mat img)
        {
            //Resize the image for better uniformity throughout the code
            CvInvoke.Resize(img, img, new Size(700, 500));

            Mat imgClone = img.Clone();

            //Convert the image to grayscale
            CvInvoke.CvtColor(img, img, ColorConversion.Bgr2Gray);

            //Blur the image
            CvInvoke.GaussianBlur(img, img, new Size(5, 5), 8, 8);

            //Threshold the image
            CvInvoke.AdaptiveThreshold(img, img, 30, AdaptiveThresholdType.GaussianC, ThresholdType.Binary, 5, 6);

            //Canny the image
            CvInvoke.Canny(img, img, 8, 8);

            //Dilate the canny image
            CvInvoke.Dilate(img, img, null, new Point(-1, -1), 8, BorderType.Constant, new MCvScalar(0, 255, 255));

            //Filter the contours to only keep relevant ones
            List<Mat> foundOutput = FindandFilterContours(imgClone, img);

            return foundOutput;
        }

The first step in processing is to resize our image to a set of standard dimensions. The advantage of this is that if our credit card images are all the same size, we can better control how we sort relevant pieces of the image for use later on.

After the dimensions have been altered, we convert the image to grayscale. Restricting the color channels from three to one allows us to better control the image outcome later on. Our image converted to grayscale can be found below.

Next we blur the image, to help eliminate any random or extra pixels, while still retaining the most relevant information, like the card number, name, and expiration date. Our blurred image can be found below.

After blurring, we threshold the image. There are a variety of different thresholding algorithms and types out there, but for our purposes, we will be using adaptive thresholding, to better distinguish relevant features while still eliminating false positives. Our thresholded image can be found below.

As you can see, thresholding eliminated the majority of the image, while still retaining most of the information that we want to keep. Although this is great, we still want to go a step further to make sure only the exact features we want to use are detected.

Next, we run Canny edge detection on the thresholded image. As you can see, the Canny image is more defined than our threshold image, and further helps define all of the features that we will want to extract. Once we have these edges, we dilate the image to connect some areas that are in pieces, which can be found below.

After our image has been dilated, we are ready to contour the image. The reason we passed our image through so many filters is to assist the algorithm's ability to detect contours and make it easier for us to filter them down to only the relevant text that we want to detect. The FindandFilterContours method that we called in the above code is declared below.

        private static List<Mat> FindandFilterContours(Mat originalImage, Mat filteredImage)
        {
            //Create a blank image that will be used to display contours
            Image<Bgr, byte> blankImage = new Image<Bgr, byte>(originalImage.Width, originalImage.Height);

            //Clone the input image
            Image<Bgr, byte> originalImageClone = originalImage.Clone().ToImage<Bgr, byte>();

            //Declare a new vector that will store contours
            VectorOfVectorOfPoint contours = new VectorOfVectorOfPoint();

            //Find and draw the contours on the blank image
            CvInvoke.FindContours(filteredImage, contours, null, RetrType.Ccomp, ChainApproxMethod.ChainApproxSimple);
            CvInvoke.DrawContours(blankImage, contours, -1, new MCvScalar(255, 0, 0));

            //Create two copies of the cloned input image
            Image<Bgr, byte> allContoursDrawn = originalImageClone.Copy();
            Image<Bgr, byte> finalCopy = originalImageClone.Copy();

            //Create two lists that will be used elsewhere in the algorithm
            List<Rectangle> listRectangles = new List<Rectangle>();
            List<int> listXValues = new List<int>();

            //Loop over all contours
            for (int i = 0; i < contours.Size; i++)
            {
                //Create a bounding rectangle around each contour
                Rectangle rect = CvInvoke.BoundingRectangle(contours[i]);
                originalImageClone.ROI = rect;

                //Add the bounding rectangle and its x value to their corresponding lists
                listRectangles.Add(rect);
                listXValues.Add(rect.X);

                //Draw the bounding rectangle on the image
                allContoursDrawn.Draw(rect, new Bgr(255, 0, 0), 5);
            }

            //Create two new lists that will hold data in the algorithms later on
            List<int> indexList = new List<int>();
            List<int> smallerXValues = new List<int>();

            //Loop over all bounding rectangles
            for (int i = 0; i < listRectangles.Count; i++)
            {
                //If a bounding rectangle fits certain dimensions, add its x value to another list
                if ((listRectangles[i].Width < 400) && (listRectangles[i].Height < 400)
                    && (listRectangles[i].Y > 200) && (listRectangles[i].Y < 300) && 
                    (listRectangles[i].Width > 50) && (listRectangles[i].Height > 40))
                {
                    originalImageClone.ROI = listRectangles[i];

                    finalCopy.Draw(listRectangles[i], new Bgr(255, 0, 0), 5);

                    smallerXValues.Add(listRectangles[i].X);
                }
            }

            //Sort the smaller list into ascending order
            smallerXValues.Sort();

            //Loop over each value in the sorted list, and check if the same value is in the original list
            //If it is, add that value's index in the original list to a new list
            for (int i = 0; i < smallerXValues.Count; i++)
            {
                for (int j = 0; j < listXValues.Count; j++)
                {
                    if (smallerXValues[i] == listXValues[j])
                    {
                        indexList.Add(j);
                    }
                }
            }

            //A list to hold the final ROIs
            List<Mat> outputImages = new List<Mat>();

            //Loop over the sorted indexes, and add them to the final list
            for (int i = 0; i < indexList.Count; i++)
            {
                originalImageClone.ROI = listRectangles[indexList[i]];

                outputImages.Add(originalImageClone.Clone().Mat);
            }

            //smallerOutput is a class-level debug Mat, declared in the full listing at the end of this article
            CvInvoke.Resize(allContoursDrawn, smallerOutput, new Size(originalImage.Width, originalImage.Height));
            CvInvoke.Imshow("Boxes Drawn on Image", smallerOutput);
            CvInvoke.WaitKey(0);

            CvInvoke.Resize(finalCopy, smallerOutput, new Size(originalImage.Width, originalImage.Height));
            CvInvoke.Imshow("Boxes Drawn on FinalCopy", smallerOutput);
            CvInvoke.WaitKey(0);

            return outputImages;
        }

Although this is a big chunk of code, I will break it down into smaller pieces to help you better understand it.

The first step is to find all contours on the image. Since we have run the image through multiple filters, we can find distinct contours more easily than we otherwise could. In the image below, we have drawn all found contours on a blank image.

Once the contours have been found, we create a bounding rectangle around each contour. The purpose of this is to create a more uniform approach to sorting our contours later on, as well as giving us access to the pixel coordinates, width, and height of each bounding rectangle.

As you can see, some parts of the image got random contours detected, like the areas with the card holder's name and the expiration date. In order to use only the contours we need, we must sort out the relevant ones. We do this using the chunk of code below.

            //Loop over all bounding rectangles
            for (int i = 0; i < listRectangles.Count; i++)
            {
                //If a bounding rectangle fits certain dimensions, add its x value to another list
                if ((listRectangles[i].Width < 400) && (listRectangles[i].Height < 400)
                    && (listRectangles[i].Y > 200) && (listRectangles[i].Y < 300) && 
                    (listRectangles[i].Width > 50) && (listRectangles[i].Height > 40))
                {
                    originalImageClone.ROI = listRectangles[i];

                    finalCopy.Draw(listRectangles[i], new Bgr(255, 0, 0), 5);

                    smallerXValues.Add(listRectangles[i].X);
                }
            }

Since we resized our image earlier to specific dimensions, we can use that to our advantage when sorting the contours. In the if statement above, we specify certain dimensions, all of which help distinguish only the card number that we want to read. It's important to note that this approach will only work for credit cards that have their numbers in the center of the card, not cards like Discover's, which have their numbers on the back of the card in the bottom left.

In the if statement, we make sure that the width and height of each bounding box is less than 400 pixels. This helps eliminate some of the larger bounding boxes, like the whole credit card, the VISA logo, and the card owner's name. Next we check that the top-left Y value of each bounding rectangle is greater than 200 pixels but less than 300 pixels. This further eliminates any extra bounding rectangles that could be detected at the top or bottom of the card. Lastly, just to be safe, we make sure each bounding rectangle's width and height are larger than 50 pixels and 40 pixels respectively.

In the end, if everything went smoothly, we end up with four bounding boxes, drawn on the image below.

To help format our detection better, we can use one line of code built into C#, which sorts our contours' x values from smallest to largest, so that the detection reads left to right. Once the list has been sorted, we loop over all of the found bounding boxes and check whether each of our four bounding boxes' x values appears in the original list. If it does, we add its index to another list. In the end we should have two lists: one with the X values of the relevant bounding boxes, and the other with their indexes in the list of all bounding boxes.

//Sort the smaller list into ascending order
smallerXValues.Sort();

//Loop over each value in the sorted list, and check if the same value is in the original list
//If it is, add that value's index in the original list to a new list
for (int i = 0; i < smallerXValues.Count; i++)
{
   for (int j = 0; j < listXValues.Count; j++)
   {
      if (smallerXValues[i] == listXValues[j])
      {
         indexList.Add(j);
      }
   }
}
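To see how this sort-and-match logic behaves on its own, here is a small standalone sketch. The X values are made up for illustration; only the sorting and index-recovery logic mirrors the code above.

```csharp
using System;
using System.Collections.Generic;

class SortDemo
{
    static void Main()
    {
        //Hypothetical X values, in the order the contours were found
        List<int> listXValues = new List<int> { 340, 120, 560, 230 };

        //Copy and sort into ascending order, as smallerXValues.Sort() does
        List<int> smallerXValues = new List<int>(listXValues);
        smallerXValues.Sort();

        //Recover each sorted value's index in the original list
        List<int> indexList = new List<int>();
        for (int i = 0; i < smallerXValues.Count; i++)
        {
            for (int j = 0; j < listXValues.Count; j++)
            {
                if (smallerXValues[i] == listXValues[j])
                {
                    indexList.Add(j);
                }
            }
        }

        //Indexes now reference the original contours in left-to-right order
        Console.WriteLine(string.Join(",", indexList)); //prints 1,3,0,2
    }
}
```

Note that this nested-loop approach assumes no two bounding boxes share the same X value; with only four well-separated digit groups on the card, that assumption holds in practice.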

Once we have found the exact bounding boxes we want, we can now begin the actual detection of text on the specific parts of the image. Although we configured Tesseract earlier in the article, we still need to create a method to detect the text on the image.

        public static string RecognizeText(Mat img)
        {
            //Change this file path to the path where the tessdata folder is located
            string filePath = Directory.GetParent(Directory.GetParent
                (Environment.CurrentDirectory).ToString()).ToString() + @"/Tessdata/";

            //Configure Tesseract using the dictionary path
            SetTesseractObjects(filePath);

            //Get all cropped regions
            List<Mat> croppedRegions = ImageProccessing(img);

            //String that will hold the output of the detected text
            string output = "";

            Tesseract.Character[] words;

            //Loop over all ROIs and detect text on each image
            for (int i = 0; i < croppedRegions.Count; i++)
            {
                StringBuilder strBuilder = new StringBuilder();

                //Set and detect text on the image
                _ocr.SetImage(croppedRegions[i]);
                _ocr.Recognize();

                words = _ocr.GetCharacters();

                for (int j = 0; j < words.Length; j++)
                {
                    strBuilder.Append(words[j].Text);
                }

                //Pass the stringbuilder into a string variable
                output += strBuilder.ToString() + " ";
            }

            //Return a string
            return output;
        }

Here we call the configuration method we declared earlier and pass it the path to the tessdata folder. Once that has happened, we call the image processing method we created earlier and get back a list of the relevant regions of our image that we want to detect text from. Afterwards, for each region in the list, we set the image to that instance in the list and recognize the text in it.

Tesseract detects characters letter by letter, including white space. As such, we use a StringBuilder to append every character to our output. Once we have detected all text in the ROIs that we specified, while also adding a space between regions for better formatting, we get our output.

Here the expected output has been detected and printed by our application. Although the process is simple and somewhat rudimentary, with some tweaking and configuration we can make it work for a larger array of credit cards.
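To tie everything together, a minimal driver might look like the following. This is a sketch: the Program class and its Main method are hypothetical, and it assumes EmguCV is installed and the Images and Tessdata folders sit where the paths above expect them.

```csharp
using System;
using Emgu.CV;

namespace Credit_Card_OCR
{
    class Program
    {
        static void Main(string[] args)
        {
            //Load the credit card image from the Images folder
            Mat img = OCR.ReadInImage();

            //Run the full pipeline: filtering, contour sorting, and Tesseract OCR
            string detectedText = OCR.RecognizeText(img);

            //Print the detected card number
            Console.WriteLine(detectedText);
        }
    }
}
```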

While this part is up on GitHub right now, I do want to update this application in the future, with the possibility of using a camera to take a picture of a card, then doing some calculations to align the taken image into better proportion to our desired input image.

Below is the entire code in the OCR class that we created, and if you want to view the whole application, you can view my GitHub repo.

using Emgu.CV;
using Emgu.CV.CvEnum;
using Emgu.CV.OCR;
using Emgu.CV.Util;
using Emgu.CV.Structure;
using System;
using System.IO;
using System.Text;
using System.Drawing;
using System.Collections.Generic;

namespace Credit_Card_OCR
{
    /// <summary>
    /// Detect text on an image
    /// </summary>
    class OCR
    {
        //Process flow of the algorithm:
        //Read in the image
        //Pass it through a variety of filters
        //Find contours
        //Sort contours from left to right
        //Read all text in each sorted relevant ROI


        //Declare a new Tesseract OCR engine
        private static Tesseract _ocr;

        //This variable is used for debugging purposes
        static Mat smallerOutput = new Mat();

        /// <summary>
        /// Set the dictionary and whitelist for Tesseract
        /// Need to investigate whether the whitelist works in later versions of Tesseract
        /// </summary>
        /// <param name="dataPath"></param>
        public static void SetTesseractObjects(string dataPath)
        {
            //create OCR engine
            _ocr = new Tesseract(dataPath, "eng", OcrEngineMode.TesseractLstmCombined);
            _ocr.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ-1234567890/");
        }

        /// <summary>
        /// Read in the image to be used for OCR
        /// </summary>
        /// <returns>A Mat object</returns>
        public static Mat ReadInImage()
        {
            //Change this file path to the path where your input image is located
            string filePath = Directory.GetParent(Directory.GetParent
                (Environment.CurrentDirectory).ToString()) + @"/Images/creditCard.png";

            //Read in the image from the filepath
            Mat img = CvInvoke.Imread(filePath, ImreadModes.AnyColor);

            //Return the image
            return img;
        }

        /// <summary>
        /// Pass the image through multiple filters and sort contours
        /// </summary>
        /// <param name="img">The image that will be processed</param>
        /// <returns>A list of Mat ROIs</returns>
        private static List<Mat> ImageProccessing(Mat img)
        {
            //Resize the image for better uniformity throughout the code
            CvInvoke.Resize(img, img, new Size(700, 500));

            Mat imgClone = img.Clone();

            //Convert the image to grayscale
            CvInvoke.CvtColor(img, img, ColorConversion.Bgr2Gray);

            //Blur the image
            CvInvoke.GaussianBlur(img, img, new Size(5, 5), 8, 8);

            //Threshold the image
            CvInvoke.AdaptiveThreshold(img, img, 30, AdaptiveThresholdType.GaussianC, ThresholdType.Binary, 5, 6);

            //Canny the image
            CvInvoke.Canny(img, img, 8, 8);

            //Dilate the canny image
            CvInvoke.Dilate(img, img, null, new Point(-1, -1), 8, BorderType.Constant, new MCvScalar(0, 255, 255));

            //Filter the contours to only keep relevant ones
            List<Mat> foundOutput = FindandFilterContours(imgClone, img);

            return foundOutput;
        }

        /// <summary>
        /// Find and sort contours found on the filtered image
        /// </summary>
        /// <param name="originalImage">The original unaltered image</param>
        /// <param name="filteredImage">The filtered image</param>
        /// <returns>A list of ROI mat objects</returns>
        private static List<Mat> FindandFilterContours(Mat originalImage, Mat filteredImage)
        {
            //Create a blank image that will be used to display contours
            Image<Bgr, byte> blankImage = new Image<Bgr, byte>(originalImage.Width, originalImage.Height);

            //Clone the input image
            Image<Bgr, byte> originalImageClone = originalImage.Clone().ToImage<Bgr, byte>();

            //Declare a new vector that will store contours
            VectorOfVectorOfPoint contours = new VectorOfVectorOfPoint();

            //Find and draw the contours on the blank image
            CvInvoke.FindContours(filteredImage, contours, null, RetrType.Ccomp, ChainApproxMethod.ChainApproxSimple);
            CvInvoke.DrawContours(blankImage, contours, -1, new MCvScalar(255, 0, 0));

            //Create two copies of the cloned input image
            Image<Bgr, byte> allContoursDrawn = originalImageClone.Copy();
            Image<Bgr, byte> finalCopy = originalImageClone.Copy();

            //Create two lists that will be used elsewhere in the algorithm
            List<Rectangle> listRectangles = new List<Rectangle>();
            List<int> listXValues = new List<int>();

            //Loop over all contours
            for (int i = 0; i < contours.Size; i++)
            {
                //Create a bounding rectangle around each contour
                Rectangle rect = CvInvoke.BoundingRectangle(contours[i]);
                originalImageClone.ROI = rect;

                //Add the bounding rectangle and its x value to their corresponding lists
                listRectangles.Add(rect);
                listXValues.Add(rect.X);

                //Draw the bounding rectangle on the image
                allContoursDrawn.Draw(rect, new Bgr(255, 0, 0), 5);
            }

            //Create two new lists that will hold data in the algorithms later on
            List<int> indexList = new List<int>();
            List<int> smallerXValues = new List<int>();

            //Loop over all bounding rectangles
            for (int i = 0; i < listRectangles.Count; i++)
            {
                //If a bounding rectangle fits certain dimensions, add its x value to another list
                if ((listRectangles[i].Width < 400) && (listRectangles[i].Height < 400)
                    && (listRectangles[i].Y > 200) && (listRectangles[i].Y < 300) && 
                    (listRectangles[i].Width > 50) && (listRectangles[i].Height > 40))
                {
                    originalImageClone.ROI = listRectangles[i];

                    finalCopy.Draw(listRectangles[i], new Bgr(255, 0, 0), 5);

                    smallerXValues.Add(listRectangles[i].X);
                }
            }

            //Sort the smaller list into ascending order
            smallerXValues.Sort();

            //Loop over each value in the sorted list, and check if the same value is in the original list
            //If it is, add that value's index in the original list to a new list
            for (int i = 0; i < smallerXValues.Count; i++)
            {
                for (int j = 0; j < listXValues.Count; j++)
                {
                    if (smallerXValues[i] == listXValues[j])
                    {
                        indexList.Add(j);
                    }
                }
            }

            //A list to hold the final ROIs
            List<Mat> outputImages = new List<Mat>();

            //Loop over the sorted indexes, and add them to the final list
            for (int i = 0; i < indexList.Count; i++)
            {
                originalImageClone.ROI = listRectangles[indexList[i]];

                outputImages.Add(originalImageClone.Clone().Mat);
            }

            CvInvoke.Resize(allContoursDrawn, smallerOutput, new Size(originalImage.Width, originalImage.Height));
            CvInvoke.Imshow("Boxes Drawn on Image", smallerOutput);
            CvInvoke.WaitKey(0);

            CvInvoke.Resize(finalCopy, smallerOutput, new Size(originalImage.Width, originalImage.Height));
            CvInvoke.Imshow("Boxes Drawn on FinalCopy", smallerOutput);
            CvInvoke.WaitKey(0);

            return outputImages;
        }

        /// <summary>
        /// Detects text on an image
        /// </summary>
        /// <param name="img">The image where text will be extracted from</param>
        /// <returns>A string of detected text</returns>
        public static string RecognizeText(Mat img)
        {
            //Change this file path to the path where the tessdata folder is located
            string filePath = Directory.GetParent(Directory.GetParent
                (Environment.CurrentDirectory).ToString()) + @"/Tessdata/";

            //Configure Tesseract using the dictionary path
            SetTesseractObjects(filePath);

            //Get all cropped regions
            List<Mat> croppedRegions = ImageProccessing(img);

            //String that will hold the output of the detected text
            string output = "";

            Tesseract.Character[] words;

            //Loop over all ROIs and detect text on each image
            for (int i = 0; i < croppedRegions.Count; i++)
            {
                StringBuilder strBuilder = new StringBuilder();

                //Set and detect text on the image
                _ocr.SetImage(croppedRegions[i]);
                _ocr.Recognize();

                words = _ocr.GetCharacters();

                for (int j = 0; j < words.Length; j++)
                {
                    strBuilder.Append(words[j].Text);
                }

                //Pass the stringbuilder into a string variable
                output += strBuilder.ToString() + " ";
            }

            //Return a string
            return output;
        }
    }
}
