Keras Tutorial for Image Classification: A Convolutional Neural Network and its Interpretation

Convolutional neural networks (CNNs) have been successfully applied in many areas of computer vision and natural language processing (NLP). Details of CNNs can be found in Ref. [1], but for the present discussion it suffices to state that a CNN is more efficient than a dense neural network [1] and learns local spatial patterns instead of global patterns [2]. CNNs are explainable since they retain the spatial properties of images/sentences and learn local patterns [2]. This is a significant advantage since deep learning models are often referred to as “black boxes” and are sometimes avoided when explainability is a key issue.

Many techniques have been proposed for visualizing and interpreting CNNs among which the most useful ones are [2]:

  1. Visualization of the outputs of intermediate convolutional layers
  2. Visualization of the filters of convolutional layers [2,3]
  3. Visualization of class-activation maps (CAM)

In this IPython notebook, I have discussed the implementation of a CNN in Keras to classify the images of the CIFAR-10 dataset. I have also briefly discussed Grad-CAM, a specific form of CAM, and used it to “explain” the decisions made by my CNN model. A sample image and the interpretation of the CNN using Grad-CAM are shown in Fig. 1. Grad-CAM can also be used in NLP to interpret a CNN model; for instance, it can identify the words used to decide the sentiment of a text.

cifar_gradCAM

Figure 1: Image of a car (left), the corresponding gradient-weighted Class Activation Map (centre) and the superimposition of the image and the Grad-CAM (right). The red ovals in the centre image indicate that the top and the back regions of the car are used by the CNN to classify the image as an “automobile”. However, the road, encircled by the blue oval, is also spuriously used by the CNN for classification.
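For the curious, a minimal Grad-CAM sketch in tf.keras is given below. This is an illustration of the idea rather than the exact code from the notebook: you supply your trained CNN, a preprocessed input image and the name of the model’s last convolutional layer.

import numpy as np
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name, class_index=None):
    """Minimal Grad-CAM sketch: returns a heatmap of shape (H, W) in [0, 1]."""
    # Map the input image to the last conv layer's activations and the predictions
    grad_model = tf.keras.models.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        # `image` is assumed to be a preprocessed float array of shape (H, W, C)
        conv_out, preds = grad_model(image[np.newaxis, ...])
        if class_index is None:
            class_index = int(tf.argmax(preds[0]))  # top predicted class
        class_score = preds[:, class_index]
    # Gradient of the class score w.r.t. the conv feature maps
    grads = tape.gradient(class_score, conv_out)
    # Average the gradients over space: one importance weight per channel
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    # Weighted sum of the feature maps, then ReLU and normalization
    cam = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights, axis=-1))
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()

The returned heatmap is then upsampled to the input image size and overlaid on the image to produce figures like Fig. 1.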

References

  1. Stanford CS231n course
  2. F. Chollet, “Deep Learning with Python”, 2017
  3. Convnet filter visualization

Sentiment Analysis of Cryptocurrencies

There is hype around investing in cryptocurrencies, and investors ranging from students to hedge-fund managers are keen on making profits by riding the wave. Social media is a powerful tool capable of influencing elections, government policies and stock as well as cryptocurrency markets. In this era of social media, John McAfee’s tweet can drive up the prices of cryptocurrencies and Kylie Jenner’s tweet can cost Snapchat $1.3 billion in market value. Hence, it is critical to assess the sentiment of the public and of some famous personalities while making investment decisions, especially in the highly volatile cryptocurrency markets.

In these IPython notebooks, I have described the process of performing sentiment analysis, from labelling the data, to training neural networks, to applying a trained network to track public sentiment towards a given cryptocurrency (e.g. Bitcoin). In this project, I chose Coindesk over Twitter since Coindesk articles are usually written by experts, whereas anyone can tweet, and such noise should not have much effect on the prices of cryptocurrencies.

SentimentAnalysis_Cryptocurrency

The first step after scraping the data is to label it for supervised training of a neural network. It is tedious to read thousands of articles and decide whether they have positive, negative or neutral sentiment. In the first IPython notebook, I describe an automated labelling process based on the idea in Ref. [1]. The labelled data, containing around 2500 news articles, is used to train a Convolutional Neural Network (CNN) in the second notebook. I also tried a Long Short-Term Memory (LSTM) network, but it took longer to train without any improvement in accuracy. In the third notebook, I apply the CNN model to track public sentiment over time towards a given cryptocurrency (aspect-based sentiment analysis).
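For reference, a minimal 1D-CNN text classifier of this kind can be defined in Keras as below. The vocabulary size and layer sizes are illustrative, not the exact values from the notebooks.

from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 20000   # illustrative vocabulary size
num_classes = 3      # positive / negative / neutral

model = keras.Sequential([
    layers.Embedding(vocab_size, 128),        # word index -> dense vector
    layers.Conv1D(64, 5, activation="relu"),  # learn local n-gram patterns
    layers.GlobalMaxPooling1D(),              # strongest response per filter
    layers.Dense(64, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])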

The accuracy of the CNN model is around 65% on the test data set, so there is ample room for improvement. The accuracy can be improved by using more data or by optimizing the neural network architecture (Ref. [2]). Future work could include studying the correlation between the sentiment score and cryptocurrency prices, identifying authors with biased views so as to give their articles less weight, using numerous news sources to train the CNN, etc.

Source code is available in this folder on my GitHub repository.

References

  1. Francesco Pochetti’s blog
  2. Konukoii blog


Pandas Cheat Sheet in IPython Notebook

Machine learning involves processing huge amounts of messy data and training models, and Python is one of the most widely used programming languages for this purpose. Pandas is a powerful library for data analysis in Python that has inherited the good qualities of Excel and NumPy. Unlike Excel, Pandas makes it simple to clean data and automate that process for future work. Unlike NumPy, it can handle heterogeneous data (integers, strings, dates etc.) and can assign names to columns. It is easy to import data from popular file formats into Pandas, clean the data, do quick exploratory data analysis, plot graphs and export the data to any popular file format. One can simply feed the values of Pandas dataframe columns as inputs to machine learning models in scikit-learn, TensorFlow or Keras and train those models.
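A minimal sketch of that workflow is shown below; the file name and column names are hypothetical.

import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical CSV with feature columns and a label column
df = pd.read_csv("data.csv")
df = df.drop_duplicates().dropna(subset=["label"])
df["age"] = df["age"].fillna(df["age"].median())  # impute missing values

X = df[["age", "income"]].values   # feed dataframe columns...
y = df["label"].values             # ...straight into scikit-learn
model = LogisticRegression().fit(X, y)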

Pandas has numerous capabilities, and it is often difficult to remember its functions or their syntax. Searching for the appropriate Pandas function can be time consuming, and cheat sheets come to our rescue. I have referred to various cheat sheets and compiled the most commonly used Pandas functions in an IPython notebook (available on my GitHub repository). Check out this notebook from time to time as I plan to update it in the future.

Data Analysis and Machine Learning to Improve Sales Effectiveness

Natural language processing (NLP) has various applications [1] and people are still discovering new ways to apply NLP to improve their business [2] or to gain an edge over their competitors. Text classification is a subset of NLP and belongs to the category of supervised machine learning, where a given text is analyzed to predict its predefined “class”. For instance, the texts “I am sad” and “It’s a sunny day!” will have predefined labels of negative and positive sentiment, respectively, and a machine learning algorithm should be able to predict the classes/labels of those texts. Common applications of deep learning (a subset of machine learning) in text classification include spam filtering on Gmail, news article classification on Google News and sentiment analysis of tweets and movie reviews. Text classification can also be used to shortlist the resumes of candidates, to improve sales effectiveness (e.g. contact only promising customers, identified by an algorithm, instead of every person on the call list), for customer relationship management (e.g. sentiment analysis of customer emails to assign priority), to match a freelancer and an employer based on the job description on freelancing websites, and the list goes on.

In this IPython notebook, I have described an approach to improve the effectiveness of the sales team of an event production company using machine learning. A convolutional neural network (CNN) model is built in Keras to predict whether a person is going to attend an event based on that person’s job title. The sales team could give higher priority to people likely to attend an event and contact them first, thereby increasing their effectiveness.
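A minimal sketch of how such a model could be applied to rank a call list is given below; `model` stands in for the trained Keras CNN, and the job titles and tokenizer settings are made up for illustration.

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Hypothetical call list to rank by predicted attendance probability
titles = ["Event Manager", "Software Engineer", "Head of Marketing"]

tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(titles)   # in practice, fit on the training titles
seqs = pad_sequences(tokenizer.texts_to_sequences(titles), maxlen=10)

# Assumes `model` is a trained binary classifier with one sigmoid output
probs = model.predict(seqs)[:, 0]
for title, p in sorted(zip(titles, probs), key=lambda t: -t[1]):
    print(f"{p:.2f}  {title}")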

The approach described in the IPython notebook can be applied to other fields/businesses/companies that involve text classification. Feel free to download the notebook and play with it.

References

  1. https://machinelearningmastery.com/applications-of-deep-learning-for-natural-language-processing/
  2. https://medium.com/xeneta/boosting-sales-with-machine-learning-fbcf2e618be3

Data Analysis and Machine Learning for the Evaluation of Startups

A venture capitalist’s decision to invest in a startup is usually based on his/her experience and intuition, without much emphasis on quantitative data. This is expected to change due to the exponential increase in the collection of data and in the compute power to analyze and model this data.

It is valuable to know whether a startup can be a success (an IPO or acquisition by a bigger company) or whether a startup can get funding to survive. However, it is not clear which factors influence its probability of success. Researchers from MIT used the education and job histories of startup founders from LinkedIn [1,5], whereas researchers from CMU used numerous basic, financial and managerial factors in their predictive models [2,6]. Some companies give more weight in their models to external factors like the market, technology trends and competitors [3,4].

I performed a basic analysis of the (limited) data about Australian startups from Crunchbase and built a simple linear regression model in scikit-learn to show a correlation between the age of a company and the amount of funding. Please have a look at this IPython notebook.
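The gist of the model is a one-variable linear regression; a minimal sketch is shown below. The file and column names are hypothetical, and the real Crunchbase export may differ.

import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical columns; adjust to the actual Crunchbase export
df = pd.read_csv("crunchbase_au.csv").dropna(
    subset=["company_age_years", "funding_total_usd"])

X = df[["company_age_years"]].values
y = df["funding_total_usd"].values

reg = LinearRegression().fit(X, y)
print("slope:", reg.coef_[0], "R^2:", reg.score(X, y))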

References

  1. https://motherboard.vice.com/en_us/article/zmep4y/mit-researchers-offer-algorithm-for-picking-winning-startups
  2. https://medium.com/decissio/artificial-intelligence-predicts-with-up-to-80-certainty-successful-startups-using-publicly-2acdf887ea42
  3. http://fortune.com/2015/08/05/venture-capital-hits-average/
  4. https://www.weforum.org/agenda/2017/07/computer-ai-machine-learning-predict-the-success-of-startups/
  5. 2017 MIT paper: https://arxiv.org/abs/1706.04229
  6. 2012 CMU paper: https://www.cs.cmu.edu/~guangx/papers/icwsm12-short.pdf

Trends/Predictions about Artificial Intelligence, Machine Learning and Deep Learning for 2018

The field of machine learning (ML) is advancing rapidly, so it is crucial for a data scientist or a machine learning engineer to read about the latest trends and be prepared for the future. It is time-consuming and sometimes confusing to go through numerous articles to stay updated. After consulting various resources, I have compiled a list of machine learning topics which could get a lot of attention in 2018.

Fundamental Research

Deep learning is one of the most important tools of a data scientist, and 2018 will witness a better understanding of its theory [1]. The hot topics of 2018 will be capsule networks, generative adversarial networks (GANs), deep reinforcement learning, lean & augmented learning, meta-learning, probabilistic programming, hybrid models and artificial general intelligence (AGI) [1-4]. “Explainable AI” will also gain a lot of interest [1,2].

Programming

Python overtook R in the field of data science and machine learning in 2017 [7,8] and this trend might continue in 2018. TensorFlow became the most popular deep learning framework in 2017 [9], and its ease of use thanks to Keras could further boost its popularity in 2018. PyTorch has been gaining traction, and Google reacted by introducing TensorFlow Fold.

The high demand for and low supply of data scientists have driven efforts to improve the efficiency and effectiveness of machine learning so that a data scientist can perform a variety of tasks in less time [3]. As a result, code-independent ML [3] and automated ML [1-3] are expected to gain prominence in 2018, and a data scientist would rightly be valued for their expertise in applying ML techniques instead of their coding abilities. This would allow people without a programming background to enter the field of data science.

Machine learning is so diverse that a person could spend their entire career building and optimizing neural networks in TensorFlow; the predictive analytics used in telecoms, e-commerce, banking etc. are different and yet important; and the skills required to set up an instance on AWS or to maintain a database are unrelated to the skills of a data scientist [3]. Hence, the mythical “full-stack data scientist” does not exist, and some companies have realized it. In 2018, more companies will realize the same and stop expecting the same person to maintain a database, build neural networks and apply predictive analytics [3].

Business and Ethics

2018 will witness a further increase in the applications of AI and machine learning in the fields of marketing, finance, healthcare, manufacturing, e-commerce, telecom, customer relationship management etc. [2,5,6]. The popular AI products of 2018 will be chatbots, virtual assistants, self-driving cars and advanced versions of AlphaGo Zero [2,3,5,6].

Data security, privacy and ethical use of AI will be the key topics of discussion between AI companies, governments and the public [2-4].

Conclusion

AI/ML will be more mature by the end of 2018 due to swift progress in fundamental research, simplification of programming neural networks, wider applications in industries and awareness regarding data privacy and ethical use of AI.

References

  1. http://usblogs.pwc.com/emerging-technology/top-10-ai-tech-trends-for-2018/
  2. https://www.kdnuggets.com/2017/12/machine-learning-ai-main-developments-2017-key-trends-2018.html
  3. https://www.datasciencecentral.com/profiles/blogs/6-predictions-about-data-science-machine-learning-and-ai-for-2018
  4. https://venturebeat.com/2018/01/02/10-predictions-for-deep-learning-in-2018/
  5. https://venturebeat.com/2017/12/20/ai-in-2018-what-works-what-doesnt-and-whats-still-science-fiction/
  6. https://dzone.com/articles/ai-and-machine-learning-trends-for-2018-what-to-ex
  7. https://www.kdnuggets.com/2017/09/python-vs-r-data-science-machine-learning.html
  8. https://www.datasciencecentral.com/profiles/blogs/python-overtakes-r-for-data-science-and-machine-learning
  9. https://medium.com/@karpathy/a-peek-at-trends-in-machine-learning-ab8a1085a106

Keras Tutorial for Beginners: A Simple Neural Network to Identify Numbers (MNIST Data)

The “dense” or “fully-connected” neural network (NN) is the simplest form of neural net, where a neuron in a given layer is connected to all the neurons in the previous and the next layers, as shown in the diagram below.

mnist_2layers

A Dense Neural Network. Image credits: ml4a.

A dense NN can only take one-dimensional (1D) input, and hence 2D inputs like images have to be “flattened” as shown in the diagram before being fed to the dense NN. This neural net can then be trained by “showing” it an image and “telling” it the number displayed in the image. The training process involves forward propagation followed by loss computation and backward propagation to update the weights of the neural net. Please refer to A. Karpathy’s blog or Andrew Ng’s course for more details. The IPython notebook shared on my GitHub repository shows that the implementation of a dense neural net in Keras requires less than 10 lines of code (step 2 onwards) and achieves an accuracy of 97% (higher accuracy can be obtained by increasing “epochs” in step 5).
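For a flavour of how little code this takes, here is a minimal dense-NN sketch in tf.keras, close in spirit (though not identical) to the notebook:

from tensorflow import keras
from tensorflow.keras import layers

# Load MNIST and scale pixel values to [0, 1]
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = keras.Sequential([
    layers.Flatten(input_shape=(28, 28)),    # 2D image -> 1D vector
    layers.Dense(128, activation="relu"),    # hidden layer
    layers.Dense(10, activation="softmax"),  # one output per digit
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)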

P.S.

  1. Here is the list of loss functions available in Keras. In general, “binary_crossentropy”, “categorical_crossentropy” and “mean_squared_error” are used for binary classification, multi-class classification and regression problems, respectively.
  2. Here is the list of activation functions. Usually, “relu” works well for hidden neurons. The “sigmoid”, “softmax” and “linear” functions are used for output neurons in binary classification, multi-class classification and regression problems, respectively.
  3. Here is the list of optimizers. “Adam” is found to work well in most cases.
  4. “accuracy” is the common metric for classification and “mean_squared_error” for regression.

Matplotlib Cheat Sheet in IPython Notebook

Matplotlib is the most widely used Python library in data science, machine learning and deep learning for plotting figures and visualizations. Its pyplot module provides a MATLAB-like interface [1], which makes it convenient for people familiar with MATLAB. Matplotlib can create a variety of plots, and it is hard to remember the functions that produce them.

I use the IPython notebook cheat sheet on my GitHub repository as a quick reference instead of spending time googling for the right function. You can just bookmark this IPython notebook and refer to it while working on your projects. The cheat sheet is a collection of commonly used matplotlib functions and provides only a brief description of each function. It is not intended to be a complete reference, since Google is just a click away once you know what plotting function you need.
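As a taste of what the cheat sheet covers, here is a minimal example of a basic line plot and a data-distribution plot:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(x, np.sin(x), label="sin(x)")    # basic line plot
ax1.set_xlabel("x")
ax1.set_ylabel("y")
ax1.legend()
ax2.hist(np.random.randn(1000), bins=30)  # data distribution
ax2.set_title("histogram")
plt.tight_layout()
plt.savefig("demo.png")                   # or plt.show()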

I will update this sheet when I come across other common functions.

References: Datacamp cheat sheet, Matplotlib docs

fig1

Fig. 1: “BASIC PLOTS”

fig2

Fig. 2: “VECTORS”

fig3

Fig. 3: “IMAGES”

fig4

Fig. 4: “DATA DISTRIBUTION” (top row) and “CONTOUR-related” (bottom row)

fig5

Fig. 5: “Setting FIGURE PROPERTIES”

fig6

Fig. 6: “OTHER COMMON PLOTS” part 1

fig7

Fig. 7: “OTHER COMMON PLOTS” part 2

Top 14 Python Libraries for Machine Learning and Deep Learning

In my previous article, I suggested some simple guidelines for beginners to start their deep learning journey. In this article, I provide a short list of the most widely used Python libraries in deep learning. You will be able to train deep neural nets and work on your pet projects after installing these libraries.

There are three main steps in creating a neural net model: data collection, data manipulation and training of the neural net, and visualization.

  1. Data collection: Most of the deep neural nets in application today fall under the category of supervised learning. For example, we need to show numerous images of cats to a neural net and “tell” it that they are cats, or we need to feed it numerous positive-sentiment sentences (e.g. “iPhone X is amazing!”) and tell it that these statements have positive sentiment. To train neural nets, you guessed it, we need lots of data! If you’re lucky, you can get cleaned data from the sources I mentioned in my previous article. Otherwise, you will have to scrape data from different websites and clean it yourself. The following libraries are helpful for scraping data from websites (see the short sketch after this list):
    1. Requests: This should be the first choice for accessing the content of a webpage, since it is faster than browser automation.
    2. Selenium: Use Selenium if Requests can’t do the job for you. It can automate manual tasks like scrolling down a page or clicking buttons.
    3. Beautiful Soup 4: Used to extract data from the HTML obtained via Requests or Selenium.
    4. Scrapy: Most data scraping can be done with the above three libraries. Use Scrapy only if you need to perform advanced scraping.
  2. Data manipulation and training of the neural network: The collected data is usually cleaned and manipulated before being fed to the neural networks. The following libraries are used for data manipulation and for training neural networks:
    1. Pandas: Good for data cleaning and manipulation. You can load data from various sources in different formats (txt, Excel, JSON etc.) into Pandas dataframes. You can then merge these dataframes, remove duplicate entries, handle missing values, visualize data etc.
    2. NumPy: Used to handle arrays and matrices and to perform mathematical operations on them.
    3. SciPy: Used for advanced mathematical operations like integration.
    4. Scikit-learn: Builds on top of NumPy and SciPy to provide machine learning algorithms like regression, classification, clustering etc.
    5. TensorFlow: An open-source library developed by Google to train deep neural networks.
    6. Keras: An intuitive interface for building and training deep neural networks using the TensorFlow backend.
    7. h5py: Used to save Keras models.
  3. Visualization: The final step is to present the results using nice graphs:
    1. Matplotlib: It is the most widely used library for plotting graphs and visualizing data.
    2. Seaborn: Builds on top of Matplotlib and provides advanced visualizations.
    3. Bokeh: Provides interactive visualizations.
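As promised above, here is a minimal scraping sketch combining Requests and Beautiful Soup 4. The URL and the h2 tag are purely illustrative; real pages need their own selectors.

import requests
from bs4 import BeautifulSoup

# Hypothetical example: fetch a page and collect its headlines
resp = requests.get("https://example.com/news")
resp.raise_for_status()                      # fail loudly on HTTP errors

soup = BeautifulSoup(resp.text, "html.parser")
headlines = [h.get_text(strip=True) for h in soup.find_all("h2")]
print(headlines)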

It is really simple to install Python libraries using pip. In Python3, it’s just:

pip3 install package-name

You can install the above mentioned libraries in Python3 as follows:

pip3 install requests
pip3 install selenium
pip3 install beautifulsoup4
pip3 install Scrapy

pip3 install pandas
pip3 install numpy
pip3 install scipy
pip3 install scikit-learn
pip3 install tensorflow
pip3 install keras
pip3 install h5py

pip3 install matplotlib
pip3 install seaborn
pip3 install bokeh

The libraries listed in this article are just enough to get you started with machine learning and deep learning. You might need libraries specific to a particular task in advanced projects (e.g. OpenCV for computer vision and NLTK/Gensim for NLP). You can get detailed information about data science libraries at [1, 2] and about scraping libraries here.

A Short Beginner’s Guide to Deep Learning

Deep learning (DL) caught my attention at the beginning of 2017 and I wanted to apply it in my PhD project. I wasn’t sure whether someone without a computer science or mathematics degree could survive in this field, but I decided to give it a try anyway. Long story short: it’s neither too easy nor too difficult to grasp deep learning.

DL requires decent programming skills and an affinity for math (you don’t have to be an expert), but the hardest part is finding good guidance and online resources if you prefer studying DL without enrolling in a full-time two-year degree. A novice is overwhelmed by numerous “beginner’s guides” providing links to hundreds of courses, blogs and tutorials. 60%* of the beginners who cross this stage will manage to do some basic math and Python courses/tutorials for a few weeks, only to conclude that they are unfit for DL. The remaining 10%* move on to advanced math or DL courses, and 90%* of them regret taking a stab at DL. In a nutshell, one gets a stick without a carrot in sight for a really long time, resulting in a loss of confidence.

That brings me to the motivation of this blog: provide information that is just enough to get one started in DL and build confidence. This helps a beginner avoid making decisions (or mistakes) regarding the courses/tutorials, programming language and libraries to use for deep learning.

I prefer learning new things in a non-linear fashion, as opposed to the conventional method described above. I recommend the following steps for those who want to embark on their deep learning journey°:

  1. Create a Google Alert with the keywords “data science”, “AI”, “artificial intelligence”, “machine learning” and “deep learning”: reading news every day about the advances and success stories of artificial intelligence (AI) is a good motivator.
  2. Install Python, Jupyter and basic libraries like NumPy, SciPy, Matplotlib and sklearn, and do basic machine learning (ML) tutorials: Python is gaining popularity as a programming language for DL and ML [1, 2]; Jupyter makes your life easy while implementing new models in Python; and Python has numerous libraries, like the ones mentioned above, to simplify our task. The Harvard data science course has free online material to practice Python programming, but it’s not necessary to go through all of it and be overwhelmed and demotivated. You can do that later in Step 7.4. Try lab2 and lab3 to get acquainted with the basic libraries, and lab6 and homework5 to do linear regression and classification, respectively, using sklearn. The cheat sheets for python3, numpy, matplotlib and sklearn [3, 4] might be handy.
  3. Install TensorFlow and Keras and do their tutorials: I prefer TensorFlow since it is open-source, supported by Google (no fear of it being abandoned) and the most popular choice for deep learning. However, it is difficult for non-programmers to use, and Keras comes to the rescue. Practice a couple of tutorials to perform linear regression and classification using Keras.
  4. Deep learning basics: Congrats! You can now work on deep learning projects. It would take you just a week to reach this point, instead of the year it would take along the conventional path. Now that you have a big picture of deep learning and some hands-on experience, you can go deeper. I recommend Andrew Ng’s deep learning course to get your basics right. The video lectures are really good, but I can understand if you get bored of listening to them for a long time and skip them; do practice the assignments, though!
  5. Blogs: Following blogs is a good way to get different perspectives. I recommend the blogs of A. Karpathy and C. Olah, and those of DeepMind and OpenAI.
  6. Twitter: Follow people on Twitter to get the latest DL research updates: Andrew Ng, Fei-Fei Li, Andrej Karpathy, Ian Goodfellow, Francois Chollet, Pieter Abbeel and the list goes on…
  7. Going further deep: If you have reached this stage, you have hands-on experience with Keras, know the basics of deep learning and are aware of the latest deep learning research. Just like we build deeper neural nets after trying shallow ones, it’s time to go deeper into our deep learning journey:
    1. Read the textbook “Deep Learning with Python” by Francois Chollet: It costs around 40 bucks, but it’s worth it. Chollet explains deep learning concepts in simple words.
    2. More Keras tutorials on github: Apply DL to various cases.
    3. DL pet projects: You have enough knowledge by now to decide which subfield you like in DL. Get data and start working on your own pet projects. You can find a list of datasets on KDnuggets, Wikipedia, Kaggle, crowdflower, CV datasets, deeplearning.net etc.
    4. Harvard data science course assignments and slides: You skipped this in step 2. Time to finish studying basic ML techniques.
    5. Still got some motivation? More material for basic ML techniques
    6. If you got some stamina left, try Stanford courses on computer vision and natural language processing (NLP) and UC Berkeley course on reinforcement learning.
    7. Still here? Go find that data science job!

P.S.

Deep learning required expert programming skills before 2015, but the introduction of Keras has relaxed this criterion. I believe it’s going to get a lot simpler to comprehend and program deep neural networks to create new products (translators, image tagging, sentiment analysis, recommendation systems etc.).

Most of the deep learning blogs I have come across are long and technical, which might not be good for beginners. I will write short blogs focusing on the applications of deep learning in different fields (computer vision, NLP etc.), followed by a short programming tutorial relevant to that field. Stay tuned!


*Disclaimer: The percentage values appearing in this blog are fictitious. Any resemblance to the real survey values, if any, is purely coincidental.
°Disclaimer: The guidance given in this article might not work for all.