With the growth of data science in recent years, the tools built for it have grown as well. R and Python have become the steady choices of practitioners worldwide. But before R and Python rose to prominence, MATLAB was the key player. MATLAB is still widely used in academia, and many researchers around the world continue to rely on it.
In this blog, we will look at the reasons why MATLAB is a strong alternative to R and Python for data science. Furthermore, we will discuss different courses which offer data science with MATLAB.
What is MATLAB?
MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation.
It is a programming platform, specifically for engineers and scientists. The heart of MATLAB is the MATLAB language, a matrix-based language allowing the most natural expression of computational mathematics.
Typical uses include:
Math and computation
Algorithm development
Modelling, simulation, and prototyping
Data analysis, exploration, and visualization
Scientific and engineering graphics
Application development, including Graphical User Interface building
The language, apps, and built-in math functions enable you to quickly explore multiple approaches to arrive at a solution. MATLAB lets you take your ideas from research to production by deploying to enterprise applications and embedded devices, as well as integrating with Simulink® and Model-Based Design.
Features of MATLAB
The following are the basic features of MATLAB:
It is a high-level language for numerical computation, visualization, and application development.
It provides an interactive environment for iterative exploration, design, and problem-solving.
It holds a vast library of mathematical functions for linear algebra, statistics, Fourier analysis, filtering, optimization, numerical integration, and solving ordinary differential equations.
It provides built-in graphics for visualizing data and tools for creating custom plots.
Its programming interface offers development tools for improving code quality and maintainability and for maximizing performance.
It provides tools for building applications with custom graphical interfaces.
It provides functions for integrating MATLAB-based algorithms with external applications and languages such as C, Java, .NET, and Microsoft Excel.
Why Use MATLAB in Data Science?
Physical-world data: MATLAB has native support for sensor, image, video, telemetry, binary, and other real-time formats. You can explore this data using MATLAB’s MapReduce functionality for Hadoop and by connecting to ODBC/JDBC databases.
Machine learning, neural networks, statistics, and beyond: MATLAB offers a full set of statistics and machine learning functionality, plus advanced methods such as nonlinear optimization, system identification, and thousands of prebuilt algorithms for image and video processing, financial modelling, and control system design.
High-speed processing of large data sets: MATLAB’s numeric routines scale directly to parallel processing on clusters and in the cloud.
Online and real-time deployment: MATLAB integrates into enterprise systems, clusters, and clouds, and can be targeted to real-time embedded hardware.
MATLAB also offers features for the entire data science problem-solving journey. Let us have a look at how MATLAB fits into every stage of a data science pipeline.
1. Accessing and Exploring Data
The first step in performing data analytics is to access the wealth of available data to explore patterns and develop deeper insights. From a single integrated environment, MATLAB helps you access data from a wide variety of sources and formats, such as databases, CSV files, audio, and video.
2. Preprocessing and Data Munging
When working with data from numerous sources and repositories, engineers and scientists need to preprocess and prepare data before developing predictive models. For example, data might have missing or erroneous values, or it might use different timestamp formats. MATLAB helps you simplify what might otherwise be time-consuming tasks, such as cleaning data, handling missing data, removing noise, dimensionality reduction, feature extraction, and domain-specific analysis of data such as video and audio.
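The two problems named above, missing values and mixed timestamp formats, are language-agnostic. As a rough illustration only, here is a minimal sketch in plain Python rather than MATLAB. The sample records, the list of accepted timestamp layouts, and the choice of mean imputation are all assumptions made for this example.

```python
from datetime import datetime
from statistics import mean

# Hypothetical raw sensor records: mixed timestamp formats, one missing reading.
RAW = [
    ("2021-03-01 10:00:00", 20.5),
    ("01/03/2021 10:05", None),      # day-first timestamp, missing value
    ("2021-03-01T10:10:00", 21.5),
]

# Timestamp layouts we assume may occur in the data.
FORMATS = ("%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M", "%Y-%m-%dT%H:%M:%S")


def parse_timestamp(text):
    """Try each known layout in turn and return a datetime."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {text!r}")


def clean(records):
    """Normalise timestamps and replace missing readings with the mean."""
    known = [value for _, value in records if value is not None]
    fill = mean(known)
    return [
        (parse_timestamp(stamp), fill if value is None else value)
        for stamp, value in records
    ]


cleaned = clean(RAW)
```

Mean imputation is only one of several reasonable strategies; interpolation or dropping the record may suit other data better.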
3. Developing Predictive Models
Prototype and build predictive models directly from data to forecast and predict the probabilities of future outcomes. You can compare machine learning approaches such as logistic regression, classification trees, support vector machines, and ensemble methods, and use model refinement and reduction tools to create an accurate model that best captures the predictive power of your data. Use flexible tools for processing financial, signal, image, video, and mapping data to create analytics for a variety of fields within the same development environment.
4. Integrating Analytics with Systems
Integrate analytics developed in MATLAB into production IT environments without having to recode or create custom infrastructure. MATLAB analytics can be packaged as deployable components compatible with a wide range of development environments such as Java, Microsoft .NET, Excel, Python, and C/C++. You can share standalone MATLAB applications or run MATLAB analytics as part of web, database, desktop, and enterprise applications. For low-latency and scalable production applications, you can manage MATLAB analytics running as a centralized service that is callable from many diverse applications.
These days, many people use MATLAB only to build a quick prototype and to validate a fresh concept by trial and error. The real implementation is then often written not in MATLAB but in Python, C++, or a similar language. In my opinion, MATLAB and Python (or Python libraries) serve different purposes. Scripting is just one of MATLAB’s many features, whereas it is the core of what Python offers. Some people use both Python and MATLAB scripts, whereas in other faculties people rely only on MATLAB toolboxes with zero scripting. Hence both Python and MATLAB will continue to exist, but the usage of MATLAB outside academia will most probably decline. MATLAB will remain until a better alternative comes along.
Summary
MATLAB provides a lot of built-in utilities which one can directly apply in data science. Furthermore, MATLAB today is heavily used in academia and research. Although languages like R and Python dominate data science worldwide, they are nowhere near the simplicity that MATLAB has to offer. MATLAB will go a long way in the field of data science in the years to come, and learning it will be a great bonus for those who want to pursue a career in research!
Among researchers, there is a growing interest in conceptualizing complex problems. This requires adopting a systems framework and using systems modelling tools to explore how the components of a complex problem interact. In particular, system simulation approaches are useful tools for understanding the processes and structures involved in complex problems. They also make it easier to identify high-leverage points in the system and to evaluate hypothetical interventions.
One tool in extensive use among researchers is agent-based modelling (ABM). We define the traits and initial behaviour rules of agents, which organize their actions and interactions. Stochasticity plays an important part in determining which agents interact and how agents make decisions.
Before going into too much depth, let us first have a look at two basic modelling approaches. After that, we can further delve into the world of agent-based modelling.
Types of modelling approaches
Equation-based modelling
Equation-based modelling (EBM) builds on a set of interrelated equations that capture the variability of a system over time (ordinary differential equations, ODEs) or over time and space (partial differential equations, PDEs). An example would be the development of pressure in a box. EBM does not primarily aim to represent the micro-level behaviour of individual agents (e.g. the velocity of individual gas particles in the box). Therefore, EBM tends to focus the modeller’s attention on the overall behaviour of the system. Its basic constituents are levels and flow rates rather than individual components.
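To make the idea of levels and flow rates concrete, here is a minimal sketch of an equation-based model in Python. The equation dP/dt = inflow - leak_rate * P, the parameter values, and the forward Euler integration scheme are all assumptions chosen for illustration; they are not derived from any particular physical box.

```python
def simulate_pressure(p0, inflow, leak_rate, dt, steps):
    """Integrate dP/dt = inflow - leak_rate * P with the forward Euler method.

    The pressure P is the single "level"; `inflow` and `leak_rate * P`
    are the flow rates that raise and lower it.
    """
    pressure = p0
    history = [pressure]
    for _ in range(steps):
        dp_dt = inflow - leak_rate * pressure   # net flow at this instant
        pressure += dp_dt * dt                  # Euler update
        history.append(pressure)
    return history


# Hypothetical parameters: the level should settle where inflow == leak_rate * P.
history = simulate_pressure(p0=0.0, inflow=2.0, leak_rate=0.5, dt=0.01, steps=5000)
equilibrium = 2.0 / 0.5
```

Note that the model tracks only the aggregate pressure; no individual gas particle appears anywhere in the code, which is exactly the system-level view described above.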
Usually, we validate EBM on the systems level by comparing model output with real system behaviour. Since the behaviour of individual components is not explicitly in its focus, we do not validate it on this level.
Furthermore, EBM (when restricted to ODE-methods like System Dynamics) has no intrinsic option for representing space (PDEs provide parsimonious options for modelling physical space, but not the interaction space of individual agents).
Several intuitive drag-and-drop tools for constructing and analyzing system dynamics models exist (Stella, Powersim, MATLAB’s Simulink, or Vensim), which makes EBM relatively easy to use and has contributed to its extensive use and deployment.
In sum, EBM seems well-suited to represent physical processes (or processes that can be seen as such without loss). It suggests regarding a system as a whole in the first place and does not support an explicit representation of components (agents). To some extent, then, EBM is a type of top-down technology. It is most naturally applicable to systems which we can model centrally, and in which physical laws rather than information processing govern the dynamics.
Agent-based modelling
ABM usually starts out by modelling the properties and behaviour of individual agents, and only thereafter considers the macro-level effects that emerge from the aggregation of the agents’ behaviour. In ABM, the individual agent is the explicit subject of the modelling effort.
With this, ABM offers an additional level of validation. Like EBM, it allows comparing model output with observed system behaviour. Additionally, however, it can be validated at the individual level by comparing the encoded behaviour of each agent with the actual behaviour of real agents. This, however, usually requires additional data and hence more effort in empirical research.
Basically, ABM might seem intuitively more appropriate for modelling social systems, since it allows, and even necessitates, considering individual decisions, dispositions and inclinations. Its natural modularization follows boundaries among individuals, whereas in EBM modularization often crosses these boundaries.
What is more, ABM allows representing space, thereby offering possibilities to consider topological particularities of interaction and information transfer. In combination with graph theory and network analysis, it enables precise conceptualizations of differences in frequency, strength, existence etc. of interactions between agents.
What is Agent-Based Modelling
Agent-based modelling (ABM) is a style of modelling in which we represent, in a program, the interactions of individuals with each other and with their environment. Agents can be, for example, people, animals, groups, or cells. They can also model entities that have no physical basis but can perform some task, such as gathering information or modelling evolution.
It is a method of modelling complex systems by defining rules and behaviours for individual components (agents) as well as the environment they are present in. We then aggregate these rules to see the general behaviour of the system. This helps in understanding how simple micro-rules of individual behaviour give rise to the macro-level behaviour of a system. Being able to model these complex systems can lead to a better understanding of them, and thereby to the ability to steer the course of events just by tweaking simple rules at the individual level.
Entities in Agent-Based Modelling
An ABM contains autonomous entities called agents. An agent can be an individual, a group of individuals, or even an organisation. Each agent is defined with properties of its own along with relationships with other agents. Apart from agents, ABMs also have environments, which are sets of conditions the agents are exposed to. Once these entities are defined in an ABM, individual behavioural rules for how an agent would behave in a given environment are defined. The aggregate of these simple individual-level behaviours leads to a complex macro-level pattern. A feedback-based learning model is often used in ABMs to update agent actions based on their changing relationships with other agents and their environment.
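The three ingredients described here (agents with properties and relationships, an environment, and a behavioural rule) can be sketched in a few lines of Python. Everything specific in this sketch is hypothetical: the `budget` and `price` fields, the adoption rule, and the four-agent chain are invented purely to show the structure, and no learning mechanism is included.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """Conditions every agent is exposed to (hypothetical: a single price)."""
    price: float


@dataclass
class Agent:
    """An agent with its own properties and relationships to other agents."""
    name: str
    budget: float                 # individual property
    adopted: bool = False         # individual state
    contacts: list = field(default_factory=list, repr=False, compare=False)

    def decide(self, env):
        """Behavioural rule: adopt if affordable and some contact has adopted."""
        if self.adopted:
            return True
        affordable = self.budget >= env.price
        influenced = any(contact.adopted for contact in self.contacts)
        return affordable and influenced


def step(agents, env):
    """Advance one tick; all agents decide on the same snapshot of the world."""
    decisions = [agent.decide(env) for agent in agents]
    for agent, decision in zip(agents, decisions):
        agent.adopted = decision


# A chain a - b - c - d: only a has adopted, and d cannot afford the price.
a = Agent("a", budget=10, adopted=True)
b = Agent("b", budget=10)
c = Agent("c", budget=10)
d = Agent("d", budget=1)
a.contacts, b.contacts, c.contacts, d.contacts = [b], [a, c], [b, d], [c]
agents = [a, b, c, d]
env = Environment(price=5)

adopters_per_tick = []
for _ in range(3):
    step(agents, env)
    adopters_per_tick.append(sum(agent.adopted for agent in agents))
```

The macro-level pattern, adoption spreading along the chain until it hits an agent who cannot afford it, is not written anywhere in the code; it emerges from the individual rule.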
Why Agent-Based modelling
Conventional models consider only factors external to their components when deciding the components’ actions. This has the drawback of not modelling the “big picture” entirely. An ABM considers not only the external factors but also each component’s interactions with other components when deciding its actions. Thus, it includes the possibility of these interactions affecting the actions as well. Also, as stated by William Rand, consumers modelled with ABM can be boundedly rational, heterogeneous in their properties and actions, adaptive and sensitive to history in their decisions, and located within social networks or geographical locations. This makes simulations of situations using ABMs more representative of the real world.
An Example of Agent-Based Models
This example simulates the spread of a fire through a forest. It shows that the fire’s chance of reaching the right edge of the forest depends critically on the density of trees. This is an example of a common feature of complex systems, the presence of a non-linear threshold or critical parameter.
Fire Model
The fire starts on the left edge of the forest and spreads to neighbouring trees. The fire spreads in four directions: north, east, south, and west.
The model assumes there is no wind. So, the fire must have trees along its path in order to advance. That is, the fire cannot skip over an unwooded area (patch), so such a patch blocks the fire’s motion in that direction.
When you run the model, how much of the forest burns? If you run it again with the same settings, do the same trees burn? How similar is the burn from run to run?
Each turtle (NetLogo’s term for an agent) that represents a piece of the fire is born and then dies without ever moving. If the fire consists of turtles but no turtles are moving, what does it mean to say that the fire moves? This is an example of different levels in a system: at the level of the individual turtles, there is no motion, but at the level of the turtles collectively over time, the fire moves.
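The rules of the fire model are simple enough to sketch outside NetLogo. The following Python version is an approximation under stated assumptions: the forest is a rectangular grid, each patch holds a tree with probability `density`, only the trees in the leftmost column are ignited, and the function name `burn_forest` and its parameters are made up for this sketch.

```python
import random
from collections import deque


def burn_forest(width, height, density, seed=None):
    """Return (fraction of trees burned, whether fire reached the right edge)."""
    rng = random.Random(seed)
    # True marks a patch with a tree; each patch is wooded with probability `density`.
    tree = [[rng.random() < density for _ in range(width)] for _ in range(height)]
    total = sum(row.count(True) for row in tree)

    # Ignite every tree in the leftmost column.
    burning = deque((r, 0) for r in range(height) if tree[r][0])
    burned = set(burning)

    while burning:
        r, c = burning.popleft()
        # No wind: fire spreads only to the north, east, south, and west neighbours.
        for dr, dc in ((-1, 0), (0, 1), (1, 0), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and tree[nr][nc] and (nr, nc) not in burned:
                burned.add((nr, nc))
                burning.append((nr, nc))

    reached_right = any(c == width - 1 for _, c in burned)
    fraction = len(burned) / total if total else 0.0
    return fraction, reached_right


# Sweep a few densities; a fixed seed makes each run repeatable.
for density in (0.4, 0.6, 0.8):
    fraction, reached = burn_forest(50, 50, density, seed=0)
    print(f"density={density:.1f} burned={fraction:.0%} reached_right={reached}")
```

Changing the seed while keeping the density fixed is a quick way to explore the questions above about how much the burn varies from run to run.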
Tools for Agent-Based Modelling
AMP. The AMP project provides extensible frameworks and exemplary tools for representing, editing, generating, executing and visualizing agent-based models (ABMs), as well as models in any other domain requiring spatial, behavioural and functional features.
Ascape. An innovative tool for developing and exploring general-purpose agent-based models.
Breve. A free, open-source software package which makes it easy to build 3D simulations of multi-agent systems and artificial life.
GAMA. A simulation platform which aims at providing field experts, modellers, and computer scientists with a complete modelling and simulation development environment for building spatially explicit multi-agent simulations.
MASON. A fast discrete-event multiagent simulation library core in Java, designed to be the foundation for large custom-purpose Java simulations.
MASS. A Multi-Agent Simulation Suite consisting of four major components built around a simulation core.
MetaABM. Supports a high-level architecture for designing, executing and systematically studying ABM models.
NetLogo. A cross-platform multi-agent programmable modeling environment.
Player/Stage. Free Software tools for robot and sensor applications.
PS-I. An environment for running agent-based simulations. It is cross-platform, with binaries available for Win32.
Repast. A free and open source agent-based modelling toolkit that simplifies model creation and use.
Problem Description
The “Heroes and Cowards” game, also called the “Friends and Enemies” game or the “Aggressors and Defenders” game, dates back to the Fratelli Theater Group at the 1999 Embracing Complexity conference, or perhaps earlier.
In the human version of this game, each person arbitrarily chooses someone else in the room to be their perceived friend, and someone to be their perceived enemy. They don’t tell anyone who they have chosen, but they all move to position themselves such that either a) they are between their friend and their enemy (BRAVE/DEFENDING), or b) they are behind their friend relative to their enemy (COWARDLY/FLEEING).
This simple model demonstrates an idealized form of this game played out by computational agents. Mostly it demonstrates how rich, complex, and surprising behaviour can emerge from simple rules and interactions.
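The two movement rules can be sketched in Python as follows. This is an illustrative interpretation, not a reference implementation: the class and function names are hypothetical, the brave target is taken to be the midpoint between friend and enemy, the cowardly target is taken to be a point half the friend-enemy distance beyond the friend, and the world is an unbounded plane.

```python
import math
import random
from dataclasses import dataclass


@dataclass(eq=False)
class Person:
    x: float
    y: float
    brave: bool
    friend: "Person" = None
    enemy: "Person" = None

    def target(self):
        """Point this person wants to reach under the two rules above."""
        fx, fy = self.friend.x, self.friend.y
        ex, ey = self.enemy.x, self.enemy.y
        if self.brave:
            # Stand between friend and enemy: aim for the midpoint.
            return (fx + ex) / 2, (fy + ey) / 2
        # Hide behind the friend: go past the friend, away from the enemy.
        return fx + (fx - ex) / 2, fy + (fy - ey) / 2

    def step_towards(self, tx, ty, speed=0.1):
        """Move at most `speed` units in a straight line towards (tx, ty)."""
        dx, dy = tx - self.x, ty - self.y
        dist = math.hypot(dx, dy)
        if dist > speed:
            self.x += speed * dx / dist
            self.y += speed * dy / dist
        else:
            self.x, self.y = tx, ty


def make_people(n, brave_share, seed=None):
    """Scatter n >= 3 people and give each a distinct random friend and enemy."""
    rng = random.Random(seed)
    people = [
        Person(rng.uniform(0, 10), rng.uniform(0, 10), rng.random() < brave_share)
        for _ in range(n)
    ]
    for person in people:
        others = [other for other in people if other is not person]
        person.friend, person.enemy = rng.sample(others, 2)
    return people


def tick(people, speed=0.1):
    """Compute every target first, then move everyone (synchronous update)."""
    targets = [person.target() for person in people]
    for person, (tx, ty) in zip(people, targets):
        person.step_towards(tx, ty, speed)
```

Calling `tick` repeatedly on the result of `make_people` and plotting the positions is enough to watch group patterns form; varying `brave_share` changes which patterns appear.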
Next Steps
In the next blog of this series, we will be implementing the above-mentioned problem. We will use NetLogo for modelling the above problem. Keep watching this space for the next part!
Conclusion
ABM offers the behavioural sciences a computational toolkit for developing precise and specific models. It focuses on how individuals interact and on discovering the patterns of behaviour and organization that emerge from these interactions. Instead of relying on verbal theories, we can now build ABMs of the phenomena which we want to understand. We can test these models against data. The more closely they agree with the data, the deeper the understanding of the phenomena we can achieve.