Keras Tutorial for Beginners: A Simple Neural Network to Identify Numbers (MNIST Data)

The “dense” or “fully-connected” neural network (NN) is the simplest form of neural net: every neuron in a given layer is connected to all the neurons in the previous and the next layers, as shown in the diagram below.

[Image: mnist_2layers]

A Dense Neural Network. Image credits: ml4a.

A dense NN can only take one-dimensional (1D) input, so 2D inputs such as images have to be “flattened”, as shown in the diagram, before they are fed to it. The neural net can then be trained by “showing” it an image and “telling” it the number displayed in the image. This training process involves forward propagation, followed by loss computation and backward propagation to update the weights of the neural net. Please refer to A. Karpathy’s blog or Andrew Ng’s course for more details. The IPython notebook shared on my GitHub repository shows that implementing a dense neural net in Keras requires fewer than 10 lines of code (step 2 onwards) and obtains an accuracy of 97% (higher accuracy can be achieved by increasing “epochs” in step 5).
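To make the flattening and the forward/loss/backward cycle concrete, here is a minimal NumPy sketch. It is not the notebook's Keras code: the data is random stand-in data rather than MNIST, and the hidden layer size (64), learning rate (0.05), number of steps (20) and the use of plain gradient descent are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data (assumption): 32 random 28x28 "images" with random digit labels.
images = rng.random((32, 28, 28))
labels = rng.integers(0, 10, size=32)

# Flatten each 2D image into a 1D vector of 28 * 28 = 784 values.
x = images.reshape(len(images), -1)
# One-hot encode the labels: digit 3 becomes [0, 0, 0, 1, 0, ...].
y = np.eye(10)[labels]

# Weights for one hidden layer (64 units, an arbitrary choice) and a 10-way output.
params = {
    "w1": rng.normal(0.0, 0.05, (784, 64)), "b1": np.zeros(64),
    "w2": rng.normal(0.0, 0.05, (64, 10)), "b2": np.zeros(10),
}

def forward(x, p):
    """Forward propagation: ReLU hidden layer, then softmax output."""
    h = np.maximum(0.0, x @ p["w1"] + p["b1"])
    logits = h @ p["w2"] + p["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return h, exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, y):
    """Mean categorical cross-entropy over the batch."""
    return float(-np.mean(np.sum(y * np.log(probs + 1e-12), axis=1)))

def train_step(x, y, p, lr=0.05):
    """One forward pass, loss computation, and backward pass with a weight update."""
    h, probs = forward(x, p)
    loss = cross_entropy(probs, y)
    # Backward propagation: gradient of the loss w.r.t. each parameter.
    d_logits = (probs - y) / len(x)
    d_h = d_logits @ p["w2"].T
    d_h[h <= 0.0] = 0.0  # ReLU passes gradient only where it was active
    grads = {
        "w1": x.T @ d_h, "b1": d_h.sum(axis=0),
        "w2": h.T @ d_logits, "b2": d_logits.sum(axis=0),
    }
    for name in p:
        p[name] -= lr * grads[name]  # plain gradient descent update
    return loss

losses = [train_step(x, y, params) for _ in range(20)]
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Keras hides all of this behind a few calls, which is why the notebook version is so short. On random labels the network can only memorize the batch, so the falling loss here shows the mechanics, not real learning.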

P.S.

  1. Here is the list of loss functions available in Keras. In general, “binary_crossentropy”, “categorical_crossentropy” and “mean_squared_error” are used for binary classification, multi-class classification and regression problems, respectively.
  2. Here is the list of activation functions. Usually, “relu” works well for hidden neurons. The “sigmoid”, “softmax” and “linear” functions are used for output neurons in binary classification, multi-class classification and regression problems, respectively.
  3. Here is the list of optimizers. “Adam” is found to work well in most cases.
  4. “accuracy” is the common metric for classification and “mean_squared_error” for regression.
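The rules of thumb above can be collected into a small lookup table. The sketch below is only an illustration: `KERAS_SETTINGS` and `build_dense_model` are hypothetical names, the hidden layer size is an arbitrary default, and the model builder assumes Keras is installed as part of TensorFlow.

```python
# Typical Keras settings by problem type, taken from the notes above.
KERAS_SETTINGS = {
    "binary": {
        "output_activation": "sigmoid",
        "loss": "binary_crossentropy",
        "metric": "accuracy",
    },
    "multiclass": {
        "output_activation": "softmax",
        "loss": "categorical_crossentropy",
        "metric": "accuracy",
    },
    "regression": {
        "output_activation": "linear",
        "loss": "mean_squared_error",
        "metric": "mean_squared_error",
    },
}

def build_dense_model(problem, n_inputs, n_outputs, n_hidden=64):
    """Build and compile a one-hidden-layer dense model (hypothetical helper)."""
    # Imported lazily so the table above can be used without TensorFlow installed.
    from tensorflow import keras

    cfg = KERAS_SETTINGS[problem]
    model = keras.Sequential([
        keras.Input(shape=(n_inputs,)),
        keras.layers.Dense(n_hidden, activation="relu"),  # "relu" for hidden neurons
        keras.layers.Dense(n_outputs, activation=cfg["output_activation"]),
    ])
    model.compile(loss=cfg["loss"], optimizer="adam", metrics=[cfg["metric"]])
    return model
```

For flattened MNIST images this would be called as `build_dense_model("multiclass", 784, 10)`. Note that “categorical_crossentropy” expects one-hot encoded labels.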

A Short Beginner’s Guide to Deep Learning

Deep learning (DL) caught my attention at the beginning of 2017 and I wanted to apply it in my PhD project. I wasn’t sure whether someone without a computer science or mathematics degree could survive in this field, but I decided to give it a try anyway. Long story short: it’s neither too easy nor too difficult to grasp deep learning.

DL requires decent programming skills and an affinity for math (you don’t have to be an expert), but the hardest part is finding good guidance and online resources if you prefer studying DL without enrolling in a full-time two-year degree. A novice would be overwhelmed by the numerous “beginner’s guides” providing links to hundreds of courses, blogs and tutorials. 60%* of the beginners who cross this stage will manage to do some basic math and Python courses/tutorials for a few weeks, only to conclude that they are unfit for DL. The remaining 10%* move on to advanced math or DL courses and 90%* of them would regret taking a stab at DL. In a nutshell, one gets a stick without a carrot in sight for a really long time – resulting in a loss of confidence.

That brings me to the motivation of this blog: to provide information that is just enough to get one started in DL and build confidence. This helps a beginner avoid making decisions (or mistakes) regarding the courses/tutorials, programming language and libraries to use for deep learning.

I prefer learning new things in a non-linear fashion as opposed to the conventional method described above. I would recommend the following steps for those who want to embark on a deep learning journey°:

  1. Create a Google Alert with the keywords “data science”, “AI”, “artificial intelligence”, “machine learning”, “deep learning”: reading news every day about the advances and success stories of artificial intelligence (AI) is a good motivator.
  2. Install Python, Jupyter and basic libraries like NumPy, SciPy, Matplotlib, sklearn. Do basic machine learning (ML) tutorials: Python is gaining popularity as a programming language for DL and ML [1, 2]; Jupyter makes your life easy while implementing new models in Python; Python has numerous libraries like the ones mentioned above to simplify our task. The Harvard data science course has free online material to practice Python programming, but there is no need to go through all of it now and end up overwhelmed and demotivated. You can do that later in Step 7.4. Try lab2 and lab3 to get acquainted with the basic libraries, and lab6 and homework5 to do linear regression and classification, respectively, using sklearn. The cheatsheets for python3, numpy, matplotlib and sklearn [3, 4] might be handy.
  3. Install TensorFlow and Keras. Do tutorials: I prefer TensorFlow since it is open-source, supported by Google (so there is little fear of development ceasing) and is the most popular choice for deep learning. However, it is difficult for non-programmers to use, and Keras comes to the rescue. Practice a couple of tutorials to perform linear regression and classification using Keras.
  4. Deep learning basics: Congrats! You can now work on deep learning projects. It takes just a week to reach this point, instead of the year it would have taken on the conventional path. Now that you have a big picture of deep learning and hands-on experience, you can go deeper. I recommend Andrew Ng’s deep learning course to get your basics right. The video lectures are really good, but I can understand if you get bored of listening to them for a long time. Skip them if you must, but practice the assignments!
  5. Blogs: Following blogs is a good way to get different perspectives. I recommend the blogs of A. Karpathy, C. Olah, DeepMind and OpenAI.
  6. Twitter: Follow people on Twitter to get the latest DL research updates: Andrew Ng, Fei-Fei Li, Andrej Karpathy, Ian Goodfellow, Francois Chollet, Pieter Abbeel and the list goes on…
  7. Going deeper: If you have reached this stage, you have hands-on experience with Keras, know the basics of deep learning and are aware of the latest deep learning research. Just like we build deeper neural nets after trying shallow ones, it’s time to go deeper in our deep learning journey:
    1. Read the textbook Deep Learning with Python by Francois Chollet: It costs around 40 bucks but it’s worth it. Chollet explains deep learning concepts in simple words.
    2. More Keras tutorials on GitHub: Apply DL to various cases.
    3. DL pet projects: You have enough knowledge by now to decide which subfield you like in DL. Get data and start working on your own pet projects. You can find a list of datasets on KDnuggets, Wikipedia, Kaggle, crowdflower, CV datasets, deeplearning.net etc.
    4. Harvard data science course assignments and slides: You skipped this in step 2. Time to finish studying basic ML techniques.
    5. Still got some motivation? More material for basic ML techniques
    6. If you’ve got some stamina left, try the Stanford courses on computer vision and natural language processing (NLP) and the UC Berkeley course on reinforcement learning.
    7. Still here? Go find that data science job!

P.S.

Deep learning required expert programming skills before 2015, but the introduction of Keras has relaxed this requirement. I believe it’s going to get a lot simpler to comprehend and program deep neural networks to create new products (translator, image tagging, sentiment analysis, recommendation system etc.).

Most of the deep learning blogs I came across are long and technical, which might not be good for beginners. I will write short blogs focusing on the applications of deep learning in different fields (computer vision, NLP etc.), each followed by a short programming tutorial relevant to that field. Stay tuned!

 

*Disclaimer: The percentage values appearing in this blog are fictitious. Any resemblance to the real survey values, if any, is purely coincidental.
°Disclaimer: The guidance given in this article might not work for all.