Introduction
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.
Data science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, information science, and computer science.
In this blog, we will look at why theoretical knowledge is something you should never skip while learning data science.
The two aspects of Data Science
When it comes to knowledge, there are different kinds and different ways of acquiring each kind. On one side is theory and on the other is the practical application of theory. Both types of knowledge are important and both make you better at whatever you do.
I think those who advance the furthest in life tend to be those who acquire knowledge at both ends of the spectrum and acquire it in a variety of ways.
Theoretical knowledge
It teaches the why and helps you understand why one technique works where another fails. It also shows you the whole forest, builds the context, and helps you set strategy. Where self-education is concerned, theory prepares you to set a direction for your future education. Theory teaches you through the experience of others.
Theoretical knowledge can often lead to a deeper understanding of a concept by showing it in the context of a greater whole and explaining the why behind it.
Practical knowledge
It helps you acquire the specific techniques that become the tools of your trade. It sits much closer to your actual day-to-day work. There are some things you can only learn through doing and experiencing. Where theory is often taught in a vacuum, the practical is learned through the reality of life.
Practical knowledge can often lead to a deeper understanding of a concept through the act of doing and personal experience.
Both of the above are important. You won’t survive in any career unless you can deliver results, and to do that you need practical knowledge. There’s no avoiding it.
At the same time, learning how to solve a specific problem only teaches you how to solve that same problem again. Practice can only take you so far. Theory helps you apply what you learned solving one problem to different problems.
How is theoretical knowledge important?
Solid Foundation
Paying attention to theoretical knowledge at the beginning helps you develop a strong foundation and understanding of the subject. Practical application focuses on implementing theoretical concepts and is important too, but jumping straight to implementation without much background is one big mistake people make while learning data science.
Ability to reason and interpret the practical application
A deep understanding of the subject allows you to interpret the results of any practical implementation. With a strong hold on theoretical concepts, you can reason about scenarios such as the poor performance of your machine learning model, or explain why you selected one technique over another. Practical application teaches you how to do things, but theoretical knowledge deals with the “what” and “why” of the implementation.
Dynamic application of rules to multiple problems
Practical implementation teaches you how to solve a specific kind of problem. By repeating it, you may become skilled at solving that particular problem. But this will not allow you to solve a different problem with the same learnings. Theoretical knowledge enables you to understand the basic concepts of data science, and with these concepts you can then approach many other problems. You are not confined to solving a single kind of problem if you have strong theoretical knowledge.
No conceptual mistakes in application
If you jump directly to practical implementation, there is a high chance of making basic conceptual mistakes. A practical implementation without proper knowledge is like a shot in the dark. You may get it right sometimes, but you are going to fail a lot more often. No one wants to see all their hard work go to waste because of a small conceptual gap that they skipped or missed by not paying enough attention to theory.
Validated application
Theoretical knowledge helps you validate your practical implementation. How do you know that the ML model you selected for a given problem is the right one? Do the results from your ML model hold any statistical significance, or are they just random values? Validation is only possible if you have a deep understanding of the underlying concepts. Without them, you can become skilled at doing a task, but you can never verify whether your solution was the optimal one for the given problem, or even whether it was right or wrong in the first place!
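To make the question about statistical significance concrete, here is a minimal sketch of a permutation test, one of several possible ways to check whether a classifier's accuracy is better than chance. The labels, predictions, and function names below are made up for illustration and are not taken from any particular library. The idea is to shuffle the true labels many times and count how often the shuffled labels agree with the predictions at least as well as the real labels do.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_p_value(y_true, y_pred, n_permutations=2000, seed=0):
    """Estimate how often shuffled labels score at least as well as the real ones.

    Returns (observed_accuracy, p_value). A small p-value suggests the
    observed accuracy is unlikely to be the result of chance alone.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = accuracy(y_true, y_pred)
    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between labels and predictions
        if accuracy(shuffled, y_pred) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate strictly above zero.
    p_value = (at_least_as_good + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical data: 40 labels, 34 of which the model predicts correctly.
y_true = [1] * 20 + [0] * 20
y_pred = [1] * 17 + [0] * 3 + [0] * 17 + [1] * 3

observed, p_value = permutation_p_value(y_true, y_pred)
print(f"accuracy={observed:.2f}, p-value={p_value:.4f}")
```

Knowing why shuffling the labels simulates "no relationship" is exactly the kind of theory that tells you when this test applies and when it does not, for example when the samples are not independent.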
Why theory in Data Science?
Vast knowledge base
Data science is a multi-disciplinary field. It involves the amalgamation of maths, business and technology. To be a good data scientist, you need to be skilled at all of these. Since data science covers such a large body of knowledge, you cannot learn it just by doing practical implementations. You have to get your hands dirty with the theoretical aspects of data science and master them first. If you rush into practical implementation, you are likely to miss a lot of edge cases and concepts, which may lead to errors in the implementation itself.
Unexplored applications
Data science is a fairly new subject. In many domains its applications are still not well defined. Every day we see how elegantly some new problem was approached and solved using data science. Solving an unexplored problem requires a lot of theoretical knowledge, which cannot come from practical implementation alone. As a data scientist, you should be able to tackle whatever problem is at hand with strong core knowledge, rather than solving a single kind of problem every time as routine work.
Constant evolution
Data science is constantly evolving and so are the concepts around it. There is a lot of research going on in data science. New use cases, algorithms and approaches appear all the time. At this pace, if you neglect the theoretical side, your knowledge will quickly become outdated. Data science requires regularly brushing up on and learning theoretical concepts.
A lot of MATH
Data science has a lot of math in it. The algorithms, the statistics, the numbers: you simply cannot skip it. Math comes to you when you first learn it and work through it in theory. With any gap in your mathematical knowledge, you may see your algorithms, analysis and insights fail right in front of you.
No clearly defined roles
Currently, data science has no clearly defined roles. A data scientist is supposed to handle maths, technology and the business aspect together. But in jobs today, there is no clear balance between these three aspects. At some firms you need to be a “math”-heavy data scientist, and at others a “tech”-heavy one. There are no clear demarcations in the industry at present. So if you want to find a job in data science, you should have strong core knowledge. This will enable you to scale up quickly in any one aspect, be it maths, business or technology.
When to switch modes
So you may ask which is more important: theoretical knowledge or practical application? I would say both matter, each in its own way. The key is to structure your learning so that you get enough theory along with enough time to implement it and learn by doing. You can start by learning some theory and then switch to a basic practical implementation of it. Once you are done with the basic implementation, come back to the theory and question every step you performed. Analyse why each step was required, whether there are other ways of performing the same task, and which of them is the most suitable.
Conclusion
Both theory and practice have their place, and neglecting one can lead to failure in the other. Theory gives you the benefit of other people's experience, and practical knowledge helps you build your own.
To simplify, it is less a question of theoretical versus practical knowledge than of formal education and self-learning. Hard knowledge and soft knowledge are like two distant beaches; it is you who has to build a strong bridge between them.
As a saying often attributed to Yogi Berra goes, “In theory, there is no difference between theory and practice. In practice there is”. And, in words attributed to Anton Chekhov, “Knowledge is of no value unless you put it into practice and practice is never possible without a deep knowledge”. We all have to find a balance between the two; otherwise the debate is of no use and has no end.