A Comprehensive Guide to Data Science With Python


A Hearty Welcome to You!

I am so thrilled to welcome you to the absolutely awesome world of data science. It is an interesting subject, sometimes difficult, sometimes a struggle but always hugely rewarding at the end of your work. While data science is not as tough as, say, quantum mechanics, it is not high-school algebra either.

It requires knowledge of Statistics, some Mathematics (Linear Algebra, Multivariable Calculus, Vector Algebra, and of course Discrete Mathematics), Operations Research (Linear and Non-Linear Optimization and some more topics including Markov Processes), Python, R, Tableau, and basic analytical and logical programming skills.

Now, if you are new to data science, that last sentence might seem more like pure Greek than simple plain English. Don’t worry about it. If you are studying the Data Science course at Dimensionless Technologies, you are in the right place. This course covers the practical working knowledge of all the topics given above, distilled and extracted into a beginner-friendly form by the talented course material preparation team.

This course has turned ordinary people into skilled data scientists and landed them excellent placements, so my basic message is: don’t worry. You are in the right place, with the right people, at the right time.

What is Data Science?


To quote Wikipedia:

Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. Data science is the same concept as data mining and big data: “use the most powerful hardware, the most powerful programming systems, and the most efficient algorithms to solve problems.”


More Greek again, you might say.

Hence my definition:

Data Science is the art of extracting critical knowledge from raw data that provides significant increases in profits for your organization.

We are surrounded by data (Google ‘data deluge’ and you’ll see what I mean). More data has been created in the last two years than in the previous 5,000 years of human existence.

The companies that use all this data to gain insights into their business and optimize their processing power will come out on top with the maximum profits in their market.

Companies like Facebook, Amazon, Microsoft, Google, and Apple (FAMGA), and every serious IT enterprise have realized this fact.

Hence the demand for talented data scientists.

I have much more to share with you on this topic, but to keep this article short, I’ll just share the links below which you can go through in your free time (everyone’s time is valuable because it is a strictly finite resource):

You can refer to:

The Demand and Salary Of A Data Scientist

and an excellent introductory article below.

An Introduction to Data Science:

 

Article Organization


Now, as I was planning this article, a number of ideas came to mind. I thought I could do a textbook-like reference to the field, with Python examples.

But then I realized that true competence in data science doesn’t come when you read an article.

True competence in data science begins when you take the programming concepts you have learned, type them into a computer, and run them on your machine.

And then, of course, modify the code, play with it, experiment, run single lines by themselves, and see for yourself how Python and R work.

That is how you fall in love with coding in data science.

At least, that’s how I fell in love with simple C coding. Back in my UG in 2003. And then C++. And then Java. And then .NET. And then SQL and Oracle. And then… And then… And then… And so on.

If you want to know, I first started working in back-propagation neural networks in the year 2006. Long before the concept of data science came along! Back then, we called it artificial intelligence and soft computing. And my final-year project was coded by hand in Java.

Having come so far, what have I learned?

That it’s a vast massive uncharted ocean out there.

The more you learn, the more you know, the more you become aware of how little you know and how vast the ocean is.

But we digress!

To get back to my point –

My final decision was to construct a beginner project, explain it inside out, and give you source code that you can experiment with, play with, enjoy running, and modify here and there referring to the documentation and seeing what everything in the code actually does.

Kaggle – Your Home For Data Science

www.kaggle.com

 

If you are in the data science field, this site should be on your browser bookmark bar. Even in multiple folders, if you have them.

Kaggle is the go-to site for every serious machine learning practitioner. They hold data science competitions that attract massive participation, have fantastic tutorials for beginners, and host free source code released under the Apache license (see this link for more on the Apache open-source software license – don’t skip reading it, because as a data scientist this is something about software products that you must know).

As I was browsing this site the other day, a kernel that was attracting a lot of attention and upvotes caught my eye.

This kernel is by a professional data scientist by the name of Fatma Kurçun from Istanbul (the funny-looking ç symbol is called c with cedilla and is pronounced with an s sound).

It was quickly clear why it was so popular. It was well-written, had excellent visualizations, and a clear logical train of thought. Her professionalism at her art is obvious.

Since it is open-source software released under the Apache license, I have modified her code quite a lot (a diff tool reports over 100 changes) to come up with the following Python classification example.

But before we dive into that, we need to know what a data science project entails and what classification means.

Let’s explore that next.

Classification and Data Science

Image: classification example – customers plotted as defaulters (red dots) and non-defaulters (green dots)

 

So supervised classification basically means mapping data values to categories defined in advance. In the image above, we have a set of customers, each with certain data values (records). Each dot corresponds to one customer with around 10-20 odd fields.

Now, how do we ascertain whether a customer is likely to default on a loan, and which customer is likely to be a non-defaulter? This is an incredibly important question in the finance field! This is where the word “classification” comes from: we classify each customer into a defaulter (red dot) class (category) or a non-defaulter (green dot) class.

This problem is not solvable by standard methods. You cannot create and analyze a closed-form solution to this problem with classical methods. But – with data science – we can approximate the function that captures or models this problem, and give a solution with an accuracy range of 90-95%. Quite remarkable!

Now, again we can have a blog article on classification alone, but to keep this article short, I’ll refer you to the following excellent articles as references:

Link 1 and Link 2

 

Steps involved in a Data Science Project

A data science project is typically composed of the following components:

  1. Defining the Problem
  2. Collecting Data from Sources
  3. Data Preprocessing
  4. Feature Engineering
  5. Algorithm Selection
  6. Hyperparameter Tuning
  7. Repeat steps 4–6 until error levels are low enough.
  8. Data Visualization
  9. Interpretation of Results

I could explain each of these steps here, but for the sake of brevity, I’ll ask you to refer to the following articles:

How to Make Machine Learning Models for Beginners

 

and:

Steps to perform data science with Python- Medium

At some time in your machine learning career, you will need to go through the article above to understand what a machine learning project entails (the bread-and-butter of every data scientist).

Jupyter Notebooks


To run the exercises in this section, we use a Jupyter notebook. Jupyter is short for Julia, Python, and R. This environment uses kernels of any of these languages and has an interactive format. It is commonly used by data science professionals and is also good for collaboration and for sharing work.

To know more about Jupyter notebooks, I can suggest the following article (read when you are curious or have the time):

 

Data Science Libraries in Python


The standard data science stack for Python has the scikit-learn Python library as a basic lowest-level foundation.

 

The scikit-learn Python library is the library most commonly used for data science in Python. Along with numpy, pandas, matplotlib, and sometimes seaborn, this toolset is known as the standard Python data science stack. To know more, I can direct you to the documentation for scikit-learn, which is excellent. The text is lucid and clear, and every page contains a working live example as source code. Refer to the following links for more:

Link 1 and Link 2

This last link is like a bible for machine learning in Python. And yes, it belongs on your browser bookmarks bar. Reading and applying these concepts and running and modifying the source code can help you go a long way towards becoming a data scientist.

And now, on to the purpose of this article – our problem definition.

 

Our Problem Definition

This is the standard beginner classification problem in data science that we will consider. To quote Kaggle.com:

The sinking of the RMS Titanic is one of the most infamous shipwrecks in history.  On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships.

One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper-class.

In this challenge, we ask you to complete the analysis of what sorts of people were likely to survive. In particular, we ask you to apply the tools of machine learning to predict which passengers survived the tragedy.


From: Kaggle

We’ll be trying to predict a person’s category as a binary classification problem – survived or died after the Titanic sank.

So now, we go through the popular source code, explaining every step.

Import Libraries

The import lines shown below are standard for nearly every Python data stack problem. Pandas is the data frame manipulation library. NumPy is a vectorized implementation of Python matrix manipulation operations, optimized to run at high speed. Matplotlib is the visualization library typically used in this context, and Seaborn is another visualization library that sits at a slightly higher level of abstraction than matplotlib.
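The original code screenshot is not reproduced here, but as a minimal sketch, the standard imports look like this (the aliases pd, np, plt, and sns are the usual conventions, not requirements):

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns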

The Problem Data Set

We read the CSV file:
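A minimal sketch of the read step, assuming the Kaggle file is saved locally as train.csv (the variable name train is just a choice for this sketch):

# Load the Titanic training data into a pandas DataFrame
train = pd.read_csv('train.csv')
print(train.head())   # peek at the first five rows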

 

Exploratory Data Analysis

Now, if you’ve gone through the links in the ‘Steps involved in a Data Science Project’ section above, you’ll know that real-world data is messy, has missing values, and often needs normalization to suit the requirements of the different scikit-learn algorithms. This CSV file is no different, as we see below:

Missing Data

This line uses seaborn to create a heatmap of our data set which shows the missing values:
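The kernel’s exact line isn’t shown here, but a typical seaborn one-liner for this kind of missing-value heatmap is sketched below (the colour map and styling are arbitrary choices):

# Highlighted cells mark missing values in each column
sns.heatmap(train.isnull(), yticklabels=False, cbar=False, cmap='viridis')
plt.show()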

Output:

 

Interpretation

The yellow bars indicate missing data. From the figure, we can see that a fifth of the Age data is missing. And the Cabin column has so many missing values that we should drop it.

Graphing the Survived vs. the Deceased in the Titanic shipwreck:
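A sketch of the count plot, assuming the Survived column holds 0 for the deceased and 1 for survivors, as it does in the Kaggle data:

# Bar chart of deaths (0) versus survivors (1)
sns.countplot(x='Survived', data=train)
plt.show()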

 

Output:

As we can see, in the sample of the data contained in train.csv, more than 500 people lost their lives and fewer than 350 people survived.

When we graph Gender Ratio, this is the result.
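The same count plot, split by gender, can be sketched as follows (again an illustration, not the kernel’s exact code):

# Survival counts broken down by the Sex column
sns.countplot(x='Survived', hue='Sex', data=train)
plt.show()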

Output

Over 400 men died, and around 100 survived. For women, fewer than a hundred died, and around 230 odd survived. Clearly, there is an imbalance here, as we expect.

Data Cleaning

The missing age data can easily be filled in with the average age of an appropriate category of the dataset (the popular Titanic kernels typically use the passenger class for this). This has to be done since the classification algorithm cannot handle missing values and will fail with errors if any remain.

Output


We use these average values to impute the missing values (impute is a fancy word for filling in missing data with values that let the algorithm run without unduly affecting its performance).
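As an illustrative sketch (not the original code), filling each missing age with the mean age of that passenger’s class (Pclass) could look like this:

# Mean age per passenger class
class_means = train.groupby('Pclass')['Age'].mean()

def impute_age(row):
    # If Age is missing, substitute the mean age of that passenger's Pclass
    if pd.isnull(row['Age']):
        return class_means[row['Pclass']]
    return row['Age']

train['Age'] = train.apply(impute_age, axis=1)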

 

 

Missing values heatmap:

 

Output:

 

 

We drop the Cabin column since it’s mostly empty, and then convert categorical features such as Sex and Embarked to dummy variables using pandas so that the algorithm runs properly (it requires the data to be numeric). The dropping step is sketched below; the dummy-variable conversion follows in the next section.
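A sketch of the dropping step, continuing with the train DataFrame from earlier (inplace=True modifies the DataFrame in place):

# Cabin is mostly empty, so drop the whole column
train.drop('Cabin', axis=1, inplace=True)

# Remove the handful of rows that still contain missing values (e.g. Embarked)
train.dropna(inplace=True)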

 

Output:

 

More Data Preprocessing

We use one-hot encoding to convert the categorical attributes to numerical equivalents. One-hot encoding is yet another data preprocessing method that has various forms. For more information on it, see the link 
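With pandas, one-hot encoding can be done with get_dummies; the sketch below assumes the Sex and Embarked columns from the Kaggle data and drops the remaining text columns afterwards (drop_first avoids perfectly correlated dummy columns):

# One-hot encode the two categorical columns
sex = pd.get_dummies(train['Sex'], drop_first=True)
embark = pd.get_dummies(train['Embarked'], drop_first=True)

# The original text columns are no longer needed once the dummies exist
train.drop(['Sex', 'Embarked', 'Name', 'Ticket'], axis=1, inplace=True)

# Stitch the numeric dummies back onto the DataFrame
train = pd.concat([train, sex, embark], axis=1)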

 

 

Finally, we check the heatmap of features again:

 

Output

 


No missing data and all text converted accurately to a numeric representation means that we can now build our classification model.

 

Building a Gradient Boosted Classifier model

Gradient Boosted Classification Trees are a type of ensemble model that has consistently accurate performance over many dataset distributions.
I could write another blog article on how they work, but for brevity, I’ll just provide two links here:

We split our data into a training set and test set.
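A sketch of the split, assuming Survived is the target column; the 30% test size and the random_state are arbitrary choices for illustration:

from sklearn.model_selection import train_test_split

X = train.drop('Survived', axis=1)   # features
y = train['Survived']                # target label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101)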

 

Training:
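Fitting the gradient boosted classifier might look like this (default hyperparameters; the kernel’s exact settings aren’t reproduced here):

from sklearn.ensemble import GradientBoostingClassifier

# Train the ensemble on the training portion of the data
model = GradientBoostingClassifier()
model.fit(X_train, y_train)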

 

Output:

 

Predicting:
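Prediction is then a single call on the held-out features, continuing with the variable names from the sketches above:

# Predict survival (0 or 1) for every passenger in the test split
predictions = model.predict(X_test)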

 

Output

 

Performance

The performance of a classifier can be measured in a number of ways. Again, to keep this article short, I’ll link to the pages that explain the confusion matrix, the classification report function of scikit-learn, and classification metrics in general:

Confusion Matrix


Predictive Model Evaluation

A wonderful article by one of our most talented writers. Skip to the section on the confusion matrix and classification accuracy to understand what the numbers below mean.

For a more concise, mathematical and formulaic description, read here

 

 

So as not to make this article too disjointed, let me explain at least the confusion matrix to you.

The confusion matrix has the following form:

[[ TP  FP ]
 [ FN  TN ]]

The abbreviations mean:

TP – True Positive – The model correctly classified this person as deceased.

FP – False Positive – The model incorrectly classified this person as deceased.

FN – False Negative – The model incorrectly classified this person as a survivor.

TN – True Negative – The model correctly classified this person as a survivor.

So, in this model published on Kaggle, there were:

89 True Positives

16 False Positives

29 False Negatives

44 True Negatives

Classification Report

You can refer to the link here to learn everything you need to know about the classification report.
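Both the confusion matrix and the classification report come from scikit-learn’s metrics module; a sketch, continuing with y_test and predictions from the earlier sketches:

from sklearn.metrics import confusion_matrix, classification_report

print(confusion_matrix(y_test, predictions))
print(classification_report(y_test, predictions))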

 

 

So the model, when used with Gradient Boosted Classification Decision Trees, has a precision of 75% (the original used Logistic Regression).

Wrap-Up

I have attached the dataset and the Python program to this document; you can download them by clicking on these links. Run the code, play with it, modify it, and consult the scikit-learn documentation. As a starting point, you should at least:

  1. Use other algorithms (say, LogisticRegression or RandomForestClassifier at the very least)
  2. Refer to the following link for classifiers to use:
    Sections 1.1 onwards – every algorithm with ‘Classifier’ at the end of its name can be used – that’s almost 30-50 odd models!
  3. Try to compare the performance of different algorithms
  4. Try to combine the performance comparison into one single program, but keep it modular.
  5. Make a list of the names of the classifiers you wish to use, apply them all, and tabulate the results (a rough sketch follows after this list). Refer to the following link:
  6. Use XGBoost instead of Gradient Boosting
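As one way to attack points 3 to 5, the loop below is a rough sketch; the classifier selection and the use of plain accuracy as the tabulated metric are just illustrative choices:

from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Candidate models to compare on the same train/test split
candidates = {
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'Random Forest': RandomForestClassifier(),
    'Gradient Boosting': GradientBoostingClassifier(),
}

for name, clf in candidates.items():
    clf.fit(X_train, y_train)
    score = accuracy_score(y_test, clf.predict(X_test))
    print(f'{name}: {score:.3f}')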

Titanic Training Dataset (here used for training and testing):

Address of my GitHub Public Repo with the Notebook and code used in this article:

Github Code

Clone with Git (use TortoiseGit for simplicity rather than the command-line) and enjoy.

To use Git, take the help of a software engineer or developer who has worked with it before. I’ll try to cover the relevance of Git for data science in a future article.

But for now, refer to the following article here

You can install Git from Git-SCM and TortoiseGit 

To clone,

  1. Install Git and TortoiseGit (the latter only if necessary)
  2. Open the command line with Run… cmd.exe
  3. Create an empty directory.
  4. Copy and paste the following string into the command prompt and watch the magic after pressing Enter: “git clone https://github.com/thomascherickal/datasciencewithpython-article-src.git” without the double quotes, of course.

Use Anaconda (a common data science development environment with Python, R, Jupyter, and much more) for best results.

Cheers! All the best in your wonderful new adventure of exploring data science!

Learning done right can be awesome fun! (Unsplash)


Follow this link, if you are looking to learn data science online!

You can follow this link for our Big Data course, which is a step further into advanced data analysis and processing!

Additionally, if you are interested in learning Data Science, click here to start the Online Data Science Course

Furthermore, if you want to read more about data science, read our Data Science Blogs

7 Technical Concepts Every Data Science Beginner Should Know


Welcome to Data Science!

 

So you want to learn data science but you don’t know where to start? Or you are a beginner and you want to learn the basic concepts? Welcome to your new career and your new life! You will discover a lot of things on your journey to becoming a data scientist and being part of a new revolution. I am a firm believer that you can learn data science and become a data scientist regardless of your age, your background, your current knowledge level, your gender, and your current position in life. I believe – from experience – that anyone can learn anything at any stage in their lives. What is required is just determination, persistence, and a tireless commitment to hard work. Nothing else matters as far as learning new things – or learning data science – is concerned. Your commitment, persistence, and investment of your available daily time are enough.

I hope you understood my statement: anyone can learn data science with the right motivation. In fact, I believe anyone can learn anything at any stage in their lives if they invest enough time, effort, and hard work into it, alongside their current occupation. From my experience, I strongly recommend that you continue your day job and work on data science as a side hustle, because of the hard work that will be involved. Your commitment is more important than your current life situation. Carrying on a full-time job and working on data science part-time is the best way to go if you want to learn in the best possible manner.

 

Technical Concepts of Data Science

So what are the important concepts of data science that you should know as a beginner? They are, in order of sequential learning, the following:

  1. Python Programming
  2. R Programming
  3. Statistics & Probability
  4. Linear Algebra
  5. Data Preparation and Data ETL*
  6. Machine Learning with Python and R
  7. Data Visualization and Summary

*Extraction, Transformation, and Loading

Now if you were to look at the above list and go to a library, you would, most likely, come back with 9-10 books at an average of 1,000 pages each. Even if you could speed-read, 10,000 pages is a lot to get through. I could list the best books for each topic in this post, but even the most seasoned reader would balk at 10,000 pages. And who reads books these days? So what I am going to give you is a distilled extract of each of those topics. Keep in mind, however, that every topic given above could be a series of blog posts in its own right; these 80-word paragraphs are just a tiny taste of each topic, and there is an ocean of depth in every one of them. You might ask: if that is the case, how can everybody be a possible candidate for a data scientist role? Two words: persistence and motivation. With the right amount of these two characteristics, anyone can be anything they want to be.

 

1) Python Programming:

Python is one of the most popular programming languages in the world. It is the ABC of data science because Python is the language almost every beginner starts with in data science. It is used for nearly any purpose since it is so amazingly versatile. Python can be used for web applications and websites with Django, microservices with Flask, general programming projects with the standard library and third-party packages from PyPI, GUIs with PyQt5 or Tkinter, and interoperability with other languages through projects such as Jython (Java) and Cython (C), among many others available today.

Of course, Python is also the first language used for data science, with the standard stack of scikit-learn (machine learning), pandas (data manipulation), matplotlib and seaborn (visualization), and numpy (vectorized computation). Nowadays, the most common technology used is the Anaconda distribution, available from www.anaconda.com (at the time of writing, version 2018.12 of Anaconda Distribution 5). To learn more about Python, I strongly recommend the following books: Head First Python and the Python Cookbook.

 

2) R Programming

R is the best language for statistical needs since it is a language designed by statisticians, for statisticians. If you know statistics and mathematics well, you will enjoy programming in R. The language gives you the best support available for probability distributions, statistical functions, mathematical functions, plotting, visualization, interoperability, and even machine learning and AI. In fact, nearly everything that you can do in Python can be done in R. R is the second most popular language for data science in the world, second only to Python. R has a rich ecosystem for every data science requirement and is the favorite language of academicians and researchers in the academic domain.

Learning Python is not enough to be a professional data scientist. You need to know R as well. A good book to start with is R for Data Science, available on Amazon at a very reasonable price. Some of the most popular packages in R that you need to know are ggplot2, threejs, DT (tables), networkD3, and leaflet for visualization; dplyr and tidyr for data manipulation; shiny and R Markdown for reporting; parallel, Rcpp, and data.table for high-performance computing; and caret, glmnet, and randomForest for machine learning.

 

3)  Statistics and Probability

This is the bread and butter of every data scientist. The best programming skills in the world will be useless without knowledge of statistics. You need to master statistics, especially practical knowledge as used in a scientific experimental analysis. There is a lot to cover. Any subtopic given below can be a blog-post in its own right. Some of the more important areas that a data scientist needs to master are:

  1. Analysis of Experiments
  2. Tests of Significance
  3. Confidence Intervals
  4. Probability Distributions
  5. Sampling Theory
  6. Central Limit Theorem
  7. Bell Curve
  8. Dimensionality Reduction
  9. Bayesian Statistics

Some places on the Internet to learn statistics from are the MIT OpenCourseWare page Introduction to Statistics and Probability, and the Khan Academy Statistics and Probability course. Good books to learn statistics from are Naked Statistics by Charles Wheelan, an awesome, almost comic-like but highly insightful book that can be read enjoyably by anyone, including those from non-technical backgrounds, and Practical Statistics for Data Scientists by Peter Bruce and Andrew Bruce.

 

4) Linear Algebra

Succinctly, linear algebra is about vectors, matrices, and the operations that can be performed on them. This is a fundamental area for data science since nearly every operation we perform as data scientists has a linear algebra background, and we usually work with collections of vectors or matrices. The following topics in linear algebra are all covered in the world-famous book Linear Algebra and Its Applications by Gilbert Strang, an MIT professor. You can also go to the popular MIT OpenCourseWare page, Linear Algebra (MIT OCW). These two resources cover everything you need to know. Some of the most fundamental concepts, which you can also Google or look up on Wikipedia, are:

  1. Vector Algebra
  2. Matrix Algebra
  3. Operations on Matrices
  4. Determinants
  5. Eigenvalues and Eigenvectors
  6. Solving Linear Systems of Equations
  7. Computer-Aided Algebra Software (Mathematica, Maple, MATLAB, etc)

 

5) Data Preparation and Data ETL (Extraction, Transformation, and Loading)


 

Yes – welcome to one of the more infamous sides of data science! If data science has a dark side, this is it. Know for sure that unless your company has some dedicated data engineers who do all the data munging and data wrangling for you, 90% of your time on the job will be spent on working with raw data. Real world data has major problems. Usually, it’s unstructured, in the wrong formats, poorly organized, contains many missing values, contains many invalid values, and contains types that are not suitable for data mining.

Dealing with this problem takes up a lot of a data scientist’s time. And your analysis has the potential to go massively wrong when there is invalid or missing data. Practically speaking, unless you are unusually blessed, you will have to manage your own data, and that means conducting your own ETL (Extraction, Transformation, and Loading). ETL is a data mining and data warehousing term that means loading data from an external data store or data mart into a form suitable for data mining and in a state suitable for data analysis (which usually involves a lot of data preprocessing). Finally, you often have to load data that is too big for your working memory – a problem referred to as external loading. During your data wrangling phase, be sure to look into the following components:

  1. Missing data
  2. Invalid data
  3. Data preprocessing
  4. Data validation
  5. Data verification
  6. Automating the Data ETL Pipeline
  7. Automation of Data Validation and Verification

Usually, expert data scientists try to automate this process as much as possible, since a human being would be wearied by this task very fast and is remarkably prone to errors, which will not happen in the case of a Python or an R script doing the same operations. Be sure to try to automate every stage in your data processing pipeline.

 

6) Machine Learning with Python and R

An expert machine learning scientist has to be proficient in the following areas at the very least:

Data Science Topics Listing – Thomas

 

Now if you are just starting out in Machine Learning (ML), Python, and R, you will gain a sense of how huge the field is, and the entire set of lists above might seem more like advanced Greek than Plain Jane English. But not to worry; there are ways to streamline your learning and to consume as little time as possible in covering nearly every single topic given above. After you learn the basics of Python and R, you need to go on to start building machine learning models. From experience, I suggest you break up your time into 50% Python and 50% R, and avoid constantly switching between the two languages. What do I mean? Spend the maximum time learning one programming language at a time. That will prevent syntax errors, conceptual errors, and language confusion.

Now, on the job, in real life, it is much more likely that you will work in a team and be responsible for only one part of the work. However, if you’re working in a startup or learning on your own initially, you will end up doing every phase of the work yourself. Be sure to give yourself time to process information and to let your brain rest and get a handle on the topics you are trying to learn. For more info, do check out the Learning How to Learn MOOC on Coursera, which is the best way to learn mathematical or scientific topics without ending up burnt out. In fact, I would recommend this approach to every programmer out there trying to learn a programming language, or anything considered difficult, like Quantum Mechanics and Quantum Computation or String Theory, or even Microsoft F# or Microsoft C# for a non-Java programmer.

I strongly recommend the book, Hands-On Machine Learning with Scikit-Learn and TensorFlow to learn Python for Data Science. The R book was given earlier in the section on R.

 

7) Data Visualization and Summary

Common tools that you have with which you can produce powerful visualizations include:

  1. Matplotlib
  2. Seaborn
  3. Bokeh
  4. ggplot2
  5. plot.ly
  6. D3.js
  7. Tableau
  8. Google Data Studio
  9. Microsoft Power BI Desktop

Some involve coding, some are drag-and-drop, some are difficult for beginners, some have no coding at all. All of these tools will help you with data visualization. But one of the most overlooked but critical practical functions of a data scientist has been included under this heading: summarisation. 

Summarisation means the practical result of your data science workflow. What does the result of your analysis mean for the operation of the business or the research problem that you are currently working on? How do you convert your result to the maximum improvement for your business? Can you measure the impact this result will have on the profit of your enterprise? If so, how? Being able to come out of a data science workflow with this result is one of the most important capacities of a data scientist. And most of the time, efficient summarisation = excellent knowledge of statistics. Please know for sure that statistics is the start and the end of every data science workflow. And you cannot afford to be ignorant about it. Refer to the section on statistics or google the term for extra sources of information.

How Can I Learn Everything Above In the Shortest Possible Time?

You might wonder – how can I learn everything given above? Is there a course or a pathway to learn every single concept described in this article in one shot? It turns out – there is. There is a dream course for a data scientist that contains nearly everything talked about in this article.

Want to Become a Data Scientist? Welcome to Dimensionless Technologies! It just so happens that the course Data Science using Python and R, a ten-week course that includes ML, Python and R programming, Statistics, GitHub Account Project Guidance, and Job Placement, offers nearly every component spoken about above, and more besides. You don’t need to buy the books or take any courses other than this one to learn the topics in this article. Everything is covered by this single course, tailor-made to convert you into a data scientist within the shortest possible time. For more, I’d like to refer you to the following link:

Data Science using R & Python

Does this seem too good to be true? Perhaps, because this is a paid course. With a scholarship concession, you could end up paying around INR 40,000 for this ten-week course; you can register for the first two weeks for INR 5,000 and pay the remainder after the two-week trial period, once you have seen whether the course really suits you. If it doesn’t, you can always drop out after two weeks and be poorer by just 5k. But in most cases, this course has been found to carry genuine worth. And nothing worthwhile was ever achieved without some payment, right?

In case you want to learn more about data science, please check out the following articles:

Data Science: What to Expect in 2019

and:

Big Data and Blockchain

Also, see:

AI and intelligent applications

and:

Evolution of Chatbots & their Performance

All the best, and enjoy data science. Every single day of your life!