# Using a Logistic regression along with Neural Networks for Cat vs Non-Cat Image Classification

--

In this blog, we will be covering up the concepts of using the logistic regression along with neural networks, applying forward and backward propagation, and then applying them to the practice in order to build your image recognition system i.e a cat classifier in this case. This cat classifier takes an image as an input and then it predicts whether the image contains a cat or not with 70% accuracy and the tools used will be Jupyter Notebook and the code is written in python.

Let’s see a bit of theory about Neural networks and Logistic regression.

# Logistic Regression

**Logistic regression** is the appropriate regression analysis to conduct when the dependent variable is dichotomous (binary). Like all regression analyses, the logistic regression is a predictive analysis.

# How do computers interpret Images

For a computer, an image is just an array of values. Typically it’s a 3-dimensional (RGB) matrix of pixel values.

For example, a 6x 6 RGB abstract image representation would look like this.

Where each pixel has a specific value of red, green, and blue that represents the color of a given pixel. Neural networks process images using matrixes of weights called filters (features) that detect specific attributes such as vertical edges, horizontal edges, etc. Moreover, as the image progresses through each layer, the filters are able to recognize more complex attributes.

# General Architecture

Mathematical Expressions to be used:

Here L represents the Loss function

Computing cost function using the formula:

*The main steps for building a Neural Network are:*

*Define the model structure (such as number of input features)**Initialize the model’s parameters**Loop:*

*Calculate current loss (forward propagation)**Calculate current gradient (backward propagation)**Update parameters (gradient descent)*

# Step 1: Creating a new Notebook

Creating a library:

*import numpy as npimport matplotlib.pyplot as pltimport h5pyimport scipyfrom PIL import Imagefrom scipy import ndimage*

*%matplotlib inline*

# Step 2: Loading the dataset

The dataset is saved here:https://github.com/aditimukerjee/Cat-and-Non-cat-for-logistic-regression

My Github has the entire code written below: https://github.com/aditimukerjee/Cat-and-Non-cat-for-logistic-regression

`train_set_x_orig`

is a NumPy-array of shape (m_train, num_px, num_px, 3)

Each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB) and num_px is the height equal to the width of a training image. Thus, each image is square (height = num_px) and (width = num_px).

# Step 3: Analyzing the dataset

m_train = train_set_x_orig.shape[0]

m_test = test_set_x_orig.shape[0]

num_px = train_set_x_orig.shape[1]

print (“Number of training examples: m_train = “ + str(m_train))

print (“Number of testing examples: m_test = “ + str(m_test))

print (“Height/Width of each image: num_px = “ + str(num_px))

print (“Each image is of size: (“ + str(num_px) + “, “ + str(num_px) + “, 3)”)

print (“train_set_x shape: “ + str(train_set_x_orig.shape))

print (“train_set_y shape: “ + str(train_set_y.shape))

print (“test_set_x shape: “ + str(test_set_x_orig.shape))

print (“test_set_y shape: “ + str(test_set_y.shape))

`Number of training examples: m_train = 209`

Number of testing examples: m_test = 50

Height/Width of each image: num_px = 64

Each image is of size: (64, 64, 3)

train_set_x shape: (209, 64, 64, 3)

train_set_y shape: (1, 209)

test_set_x shape: (50, 64, 64, 3)

test_set_y shape: (1, 50)

# Step 3: Reshaping the dataset

We need to now reshape images of shape (num_px, num_px, 3) in a numpy-array of shape (num_px ∗ num_px ∗ 3, 1). After this, our training (and test) dataset is a numpy-array where each column represents a flattened image. There should be m_train (respectively m_test) columns.

Flattening array means converting a multidimensional array into a 1D array.

A trick when you want to flatten a matrix X of shape (a,b,c,d) to a matrix X_flatten of shape (b ∗ c ∗ d, a) is to use:

X_flatten = X.reshape(X.shape[0], -1).T

Removing transpose it will be

X_flatten = X.reshape(-1, X.shape[0]) where -1 is the unspecified element of new the shape of x

Since x is originally of shape (m_train, num_px, num_px, 3) since already used x.shape[0] aka m_train , the remaining elements in the new shape should be num_px * num_px * 3 to make sure x can be reshaped so instead of specifying num_px * num_px * 3, we specify it as -1

train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1)

test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T

print (“train_set_x_flatten shape: “ + str(train_set_x_flatten.shape))

print (“train_set_y shape: “ + str(train_set_y.shape))

print (“test_set_x_flatten shape: “ + str(test_set_x_flatten.shape))

print (“test_set_y shape: “ + str(test_set_y.shape))

print (“sanity check after reshaping: “ + str(train_set_x_flatten[0:5,0]))

`train_set_x_flatten shape: (12288, 209)`

train_set_y shape: (1, 209)

test_set_x_flatten shape: (12288, 50)

test_set_y shape: (1, 50)

sanity check after reshaping: [17 31 56 22 33]

To represent color images, the red, green, and blue channels (RGB) must be specified for each pixel, and so the pixel value is actually a vector of three numbers ranging from 0 to 255.

train_set_x = train_set_x_flatten/255

test_set_x = test_set_x_flatten/255

# Step 4: Sigmoid function

Writing down the sigmoid function , a prerequisite for logistic regression

*def sigmoid(z):s = 1/(1+np.exp(-z))return s*

*print (“sigmoid([0, 2]) = “ + str(sigmoid(np.array([0,2]))))*

Output : sigmoid([0, 2])[ 0.5 0.88079708]

# Step 4: Initiating parameters

Initializing w as a vector of zeros. Initialize_with_zeros function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

*def initialize_with_zeros(dim):w = np.zeros((dim, 1)) b = 0*

*assert(w.shape == (dim, 1)) assert(isinstance(b, float) or isinstance(b, int)) return w, b*

*dim = 2w, b = initialize_with_zeros(dim)print (“w = “ + str(w))print (“b = “ + str(b))*

`Output: w= [[ 0.] [ 0.]] b=0`

# Step 5: Forward and Backward Propogation

Implementing a function `propagate()`

that computes the cost function and its gradient.

def propagate(w, b, X, Y):

m = X.shape[1]

# FORWARD PROPAGATION (FROM X TO COST)

A = sigmoid(np.dot(w.T, X) + b) # compute activation

cost = -1./m* np.sum(Y*np.log(A) + (1-Y)*np.log(1-A)) # compute cost

# BACKWARD PROPAGATION (TO FIND GRAD)

dw = 1./m*np.dot(X, (A-Y).T)

db = 1./m*np.sum(A-Y)

assert(dw.shape == w.shape)

assert(db.dtype == float)

#cost = np.squeeze(cost)

#assert(cost.shape == ())

grads = {“dw”: dw,

“db”: db}

return grads, cost

*w, b, X, Y = np.array([[1.],[2.]]), 2., np.array([[1.,2.,-1.],[3.,4.,-3.2]]), np.array([[1,0,1]])grads, cost = propagate(w, b, X, Y)print (“dw = “ + str(grads[“dw”]))print (“db = “ + str(grads[“db”]))print (“cost = “ + str(cost))*

`Output : dw=[[ 0.99845601] [ 2.39507239]] ,db=0.00145557813678 ,cost=5.801545319394553`

# Step 6: Optimization

The goal is to learn w and b by minimizing the cost function J. For a parameter θ, the update rule is θ=θ−α dθ, where α is the learning rate.

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):

costs = []

for i in range(num_iterations):

# Cost and gradient calculation (≈ 1–4 lines of code)

grads, cost =propagate(w, b, X, Y)

# Retrieve derivatives from grads

dw = grads[“dw”]

db = grads[“db”]

# update rule (≈ 2 lines of code)

w = w — learning_rate * dw

b = b — learning_rate * db

# Record the costs

if i % 100 == 0:

costs.append(cost)

# Print the cost every 100 training examples

if print_cost and i % 100 == 0:

print (“Cost after iteration %i: %f” %(i, cost))

params = {“w”: w,

“b”: b}

grads = {“dw”: dw,

“db”: db}

return params, grads, costs

params, grads, costs = optimize(w, b, X, Y, num_iterations= 100, learning_rate = 0.009, print_cost = False)

print (“w = “ + str(params[“w”]))

print (“b = “ + str(params[“b”]))

print (“dw = “ + str(grads[“dw”]))

print (“db = “ + str(grads[“db”]))

`w = [[0.19033591]`

[0.12259159]]

b = 1.9253598300845747

dw = [[0.67752042]

[1.41625495]]

db = 0.21919450454067652

# Step 7 : Prediction

There are two steps to computing predictions:

- Calculate Ŷ =A=σ(wTX+b)Y^=A=σ(wTX+b)
- Convert the entries of a into 0 (if activation <= 0.5) or 1 (if activation > 0.5), stores the predictions in a vector
`Y_prediction`

. If you wish, you can use an`if`

/`else`

statement in a`for`

loop (though there is also a way to vectorize this).

# GRADED FUNCTION: predict

def predict(w, b, X):

m = X.shape[1]

Y_prediction = np.zeros((1,m))

w = w.reshape(X.shape[0], 1)

# Compute vector “A” predicting the probabilities of a cat being present in the picture

A = sigmoid(np.dot(w.T, X) + b)

for i in range(A.shape[1]):

# Convert probabilities A[0,i] to actual predictions p[0,i]

if A[0, i] > 0.5:

Y_prediction[0, i] = 1

else:

Y_prediction[0, i] = 0

assert(Y_prediction.shape == (1, m))

return Y_prediction

w = np.array([[0.1124579],[0.23106775]])

b = -0.3

X = np.array([[1.,-1.1,-3.2],[1.2,2.,0.1]])

print (“predictions = “ + str(predict(w, b, X)))

`predictions = [[1. 1. 0.]]`

# Step 8 : Merge all functions into a mode

Merging all the functions developed so far together.

def model(train_x, train_y_orig, test_x, test_y_orig, num_iterations = 3000, learning_rate = 0.6, print_cost = False):

w=np.zeros((train_x.shape[0],1))

b=0

parameters, grads, costs = optimize(w, b, train_x, train_y_orig, num_iterations, learning_rate, print_cost = False)

# Retrieve parameters w and b from dictionary “parameters”

w = parameters[“w”]

b = parameters[“b”]

# Predict test/train set examples

Y_prediction_test = predict(w, b, test_x)

Y_prediction_train = predict(w, b, train_x)

# Print train/test Errors

print(“train accuracy: {} %”.format(100 — np.mean(np.abs(Y_prediction_train — train_y_orig)) * 100))

print(“test accuracy: {} %”.format(100 — np.mean(np.abs(Y_prediction_test — test_y_orig)) * 100))

d = {“costs”: costs,

“Y_prediction_test”: Y_prediction_test,

“Y_prediction_train” : Y_prediction_train,

“w” : w,

“b” : b,

“learning_rate” : learning_rate,

“num_iterations”: num_iterations}

return d

d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 5000, learning_rate = 0.8, print_cost = True)

`train accuracy: 99.99876382512535 %`

test accuracy: 72.01052229722973 %

# Step 12: Testing your model with your images

Created an image directory in the notebook and upload my images there, then assign the value of variable image in the following code as the image name.

This is a cat image and the image prediction is correct.

This is a non — cat image and the image prediction is incorrect.

This is a non-cat image and the image prediction is correct.

This is a non-cat image and the image prediction is correct.

This is a non-cat image and the image prediction is correct.

This is a cat image and the image prediction is correct.

# Conclusion

The accuracy of the training set is 99.99% and the accuracy of the testing set is 72%. Out of the six images used for prediction, one image was predicted incorrectly since the accuracy is still low and it can be further improved.

The dataset used in this model is available on my Github along with my code that is available for public use. **If you have any questions or comments or need any further clarifications please don’t hesitate to contact me at aditimukerjee33@gmail.com or reach me at 403–671–7296.** **If you are interested in collaborating on any project, feel free to reach out to me without any hesitation.**