Data Science Competency
Interviews are the bane of the IT world, and it is no secret that most commonly used interview methods are flawed. They favor problem-solving skills while failing to measure things that matter more, such as teamwork and the ability to be humble yet competent, which determine how well a person fits into a company. There is a lot to discuss on this issue. The main reason to check an individual’s competency is the hiring process, and there is a fundamentally reasonable way to do this.
Pardon – you mean that there is a way to accurately evaluate a new candidate’s technical acumen, beyond open-ended interviews? One that goes beyond interview bias and traditional stereotyping?
Yes.
There are methods for evaluating competence, such as the data science matrix of competency from Link, but there is a need for more objective assessment. The data science matrix of competence is shown below:
[Figure: the data science matrix of competence]
And by competence that is conscious and unconscious, we mean the following:
[Figure: conscious and unconscious competence]
The matrix is a great tool, but it is best used by people who are judging themselves. Do read the article if you are interested in this aspect of assessing data science professionals. The link is here
A more consistent way to hire great talent on a regular basis is given in this article: link
This article gives an excellent way to evaluate data science abilities, directed towards the hiring process. How do they do it?
The First Way – The Take-Home Test
Ideally, you want all your candidates to have a level playing field. This means that in the first round of the interview process, you want to measure them on the same scale. This test should be:
- Explicit – nobody should have to contact you for doubts or questions
- Graded – there should be a linear increase in the difficulty level of the questions
- Short – An expert candidate should take no longer than two hours to complete all the questions
- Comprehensive – The questions should represent the skills on an average day of work
- Public – Use only data that can be made public, since the test will be available to the general public
This component should test the following areas:
- Statistics – Make sure you cover Hypothesis Testing, the Central Limit Theorem, Analysis of Experiments, and Confidence Intervals
- Basic Mathematics – Linear Algebra, Matrix Algebra, Probability, Bayes’ Theorem or Conditional Probability, Basics of Differential Equations if necessary
- Coding Skills – Modular, functional, documented, and testable code; use of modern tools and libraries; maturity of programming language expertise (e.g. Python generator functions)
- Methodology – How correct is the answer and is the logical process sound? Have the correct methods been used? Is the answer reproducible and clearly accurate?
- Communication Skills – Can the process be explained to someone in management? Is the communication clear enough for a layman, with the right balance of technical terms and simple explanations?
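As an illustration of what the Coding Skills criterion above might reward, here is a minimal sketch of a small, documented, testable Python generator function. The function name `rolling_mean` and the sample data are assumptions made for this example; they are not taken from any actual take-home test.

```python
from collections import deque
from typing import Iterable, Iterator


def rolling_mean(values: Iterable[float], window: int) -> Iterator[float]:
    """Yield the mean of the last `window` values, one result per full window.

    Because this is a generator, the input is processed lazily and no more
    than `window` values are held in memory at any time.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    buffer = deque(maxlen=window)  # old values fall off the left automatically
    for value in values:
        buffer.append(value)
        if len(buffer) == window:
            yield sum(buffer) / window


if __name__ == "__main__":
    # Hypothetical sample data, used only to show how the generator is consumed.
    print(list(rolling_mean([1, 2, 3, 4, 5], window=2)))  # [1.5, 2.5, 3.5, 4.5]
```

A reviewer can check modularity (one function, one job), documentation (the docstring), and testability (the function has no side effects) in a few minutes.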
Once the candidate is through this process, you can go on to the next three steps, encapsulated in the Data Day:
The Data Day
The candidate spends a full day working beside the team on a more open-ended challenge, concluding with a presentation of their work to a group. While this is a challenging process to organize and execute, it is the “gold standard” of the interview evaluation process. As of today, there is no better way to evaluate a candidate’s technical skill, as well as their culture and work fit for the entire company.
Data Day is a simulation of a typical day at the company. Most candidates will expect a four-hour challenge, but you will have to convince them to spend an entire day at your company. So you will have to give them a sales pitch about data science at your company. The better your pitch, the more people will come to your Data Day.
Once they arrive for the Data Day, each candidate is assigned a “buddy” (contact at the organization), a laptop with prerequisite software that your company uses, and a data set. Working in conjunction with your team, each candidate has to come up with:
Direction: Based on an analysis of the data and the project requirements, the candidate has to spend some time with the team deciding how their project has to go forward.
Execution: Each candidate has to execute their plan and produce a concrete, functioning project, covering preprocessing, analysis, modeling, hyperparameter tuning, and conclusions, all the while working in conjunction with your existing team.
Presentation: The candidate has to present the project and the findings from the dataset to the entire team, and be ready to answer the team’s questions cogently and accurately.
Once the candidate leaves, the entire team meets for a one-hour meeting about each candidate. There are two fundamental metrics:
The Second Way – Data Day – Technical Skill
From Link
1. Problem structuring
How did the candidate structure the problem, what assumptions did the candidate make and how did the candidate narrow the scope?
2. Technical rigor
How reliable, readable and flexible was the code that the candidate developed to accomplish their work? How scalable would the approach be?
3. Analytical rigor
How logically sound, complete and meaningful was the approach (machine learning, statistics, analytics, visualization) that the candidate applied?
4. Communication
How clearly was the candidate able to describe their work, approach, methodology and conclusions? How effectively did the candidate answer questions?
5. Usefulness
If made production-worthy, how useful would the results of the candidate’s work be to the company?
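One way to keep the one-hour team discussion grounded is to have every reviewer score the five criteria before the meeting and then compare averages. The sketch below is only an illustration: the 1 to 5 scale, the `summarize_scores` function, and the reviewer names are assumptions, not something the quoted article prescribes.

```python
from statistics import mean

# The five technical criteria listed above.
CRITERIA = (
    "problem_structuring",
    "technical_rigor",
    "analytical_rigor",
    "communication",
    "usefulness",
)


def summarize_scores(reviews):
    """Average each criterion across reviewers.

    `reviews` maps a reviewer name to a dict of criterion -> score.
    Scores are assumed to be on a 1 to 5 scale (a hypothetical choice).
    """
    if not reviews:
        raise ValueError("at least one review is required")
    for reviewer, scores in reviews.items():
        for criterion in CRITERIA:
            if criterion not in scores:
                raise ValueError(f"{reviewer} did not score {criterion}")
            if not 1 <= scores[criterion] <= 5:
                raise ValueError(f"{reviewer} gave an out-of-range score")
    return {c: mean(r[c] for r in reviews.values()) for c in CRITERIA}


if __name__ == "__main__":
    # Hypothetical reviewers and scores, for illustration only.
    reviews = {
        "reviewer_a": dict(zip(CRITERIA, (4, 3, 4, 5, 3))),
        "reviewer_b": dict(zip(CRITERIA, (5, 3, 3, 4, 4))),
    }
    for criterion, average in summarize_scores(reviews).items():
        print(f"{criterion}: {average}")
```

Averages are a starting point for the conversation rather than a verdict; a large gap between two reviewers on the same criterion is usually worth discussing directly.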
The Third Way – Data Day – Cultural Fit
If even a single existing employee feels lukewarm about the candidate, that candidate is rejected. In order to avoid bias, the least experienced employees speak first and the most experienced employees speak last. The candidate’s culture fit and rapport with the team are discussed openly. Quoting from the original article:
Again, from First-round
When this happens for culture fit or communication reasons, it’s critical to discuss the issue openly. That helps establish and reinforce a healthy norm for how your team wants to behave, and reduces the risk of succumbing to one individual’s biases.
If everyone is lukewarm about a candidate, that is also an obvious “no.” Often this is due to limitations in their work — how much they accomplished, the rigor in their thinking, or their technical execution. If an impasse remains, then either the team leader should make a final decision (err on the side of rejection), or in rare cases, you may want to invite the candidate back for further discussion.
The Fourth Way – Running a Data Science Competition
This might be a slightly expensive way to hire and attract new talent, but it is the most wide-ranging and the most objective way to find technical skill. The main constraint is that your data cannot be sensitive; it must be public and open for anyone to see. Other than this, there is no better way to reach across the entire online data science community and get a critical project done by offering a token sum of money as a reward. For such a project, this is preferable to hiring a new employee from scratch. You already know that the top ten winning candidates have what it takes technically, so you are free to explore the other metrics, like culture fit, communication, temperament and, of course, workplace attitude! Some of the possible platforms on which you can post competitions are: Kaggle
And many, many more exist online!
The Fifth Way – Plain Old Interview
While all the above methods are excellent at covering technical skill, there is a lot more to a candidate than technical skill. I found a post on Quora that mentions some really simple questions that seem insignificant but reveal a lot about the candidate’s mindset. Because they are innocuous, you should ask them of every candidate. The questions are:
- Do you consider yourself lucky? (checks humility versus cocksureness in a candidate who has been successful in the past)
- How was the traffic on the commute today? (checks a complaining nature versus a positive attitude that sees the good in everything)
- What is the opinion you have of your last boss? (checks a tendency to bad-mouth and gossip versus being prudent and honoring loyalties)
- What were the problems you had with your last team at work? (checks a tendency to take personal responsibility for every aspect of a project versus a tendency to put the blame on others)
- How much should we pay you as a salary and why? (again checks modesty, humility, and a measured statement of industry facts versus a tendency to oversell yourself and be bombastic out of greed for money)
- What are your strengths and weaknesses? (by this time, you already have a good idea, but this demonstrates a candidate’s self-awareness versus a tendency to paint an overly rosy picture of themselves, because everyone has weaknesses)
- What can you tell us about our company? (checks the research the candidate has put into the company and how much the candidate has prepared for this interview: the principle of hard, meaningful work)
These are my questions; I am sure you will have a lot more of your own.
Concluding Words
So these are my best ways to test data science competency. You will notice that the points mentioned here have a bias towards hiring employees, and rightly so, since that is exactly when you most want to test a person’s skills. You can follow as many of these practices as you would like. However, remember that even the best strategies can sometimes lead to hires who are not ideal. At such times, it is important to learn from every hire, track the employee’s performance, and create a database that records your initial impression of each employee alongside their eventual performance. It is always important to keep learning, changing, modifying, and understanding. Keep a steady track of your progress, and you will find that your ability to assess data science competency gets better as you improve your own evaluation strategies over time. All the best!
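The tracking database suggested above can start very small. Here is a minimal sketch, assuming each hire gets a numeric interview score and, later, a numeric performance score on the same scale; the `HireRecord` class, the field names, and the sample values are hypothetical. It computes the Pearson correlation between the two as one rough signal of how well your process predicts performance.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class HireRecord:
    name: str
    interview_score: float    # initial impression at hiring time (assumed scale)
    performance_score: float  # later performance review (same assumed scale)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


def impression_vs_performance(records):
    """Correlation between interview impressions and later performance."""
    return pearson(
        [r.interview_score for r in records],
        [r.performance_score for r in records],
    )


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    records = [
        HireRecord("hire_1", 3.0, 2.5),
        HireRecord("hire_2", 4.0, 4.0),
        HireRecord("hire_3", 5.0, 4.5),
    ]
    print(round(impression_vs_performance(records), 3))
```

A value close to 1 suggests that interview impressions track later performance, while a value near 0 suggests the evaluation process needs revisiting. With only a handful of hires the number is noisy, so treat it as a prompt for discussion rather than a measurement.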