13 Mistakes of Data Scientists... and How to Avoid Them!

3 years ago



Hello steemians,

Today I am back with new content that should be very useful for amateur data scientists. If you have chosen data science as your career, this post is a must-read. More and more businesses are becoming data-oriented, so the demand for data scientists is at a peak, and on top of that, many industries face a shortage of talent. However, being a data scientist is not easy. You need problem-solving skills, coding skills, and other technical skills as well. If you do not come from a technical or mathematical background, you will probably rely on the video courses available online. But these resources often won't teach you the skills you need in industry. This is why many amateur data scientists struggle to cope with real-world jobs.

In this article, I will discuss the top mistakes that amateur data scientists make. It should help you avoid pitfalls and traps on your data science journey.


1. Just learning theoretical concepts and not applying them.
2. Heading straight for Machine Learning without knowing the prerequisites.
3. Relying only on degrees and certifications.
4. Assuming that ML competitions are what real-life jobs are like.
5. Focusing on model accuracy over applicability and interpretability.
6. Using too many data science terms in your resume.
7. Giving tools and libraries precedence over the business problem.
8. Spending too little time on visualizing and exploring the data.
9. Lack of a structured approach to problem solving.
10. Learning multiple tools at once.
11. Lack of consistent study.
12. Avoiding discussions and competitions.
13. Lack of communication skills.

1. Just learning theoretical concepts and not applying them:


It's good to pick up theoretical knowledge of Machine Learning. But if you don't apply it, it will be of no use.
There is so much to learn, such as derivations, algorithms, research papers, etc., that many of you will lose motivation and stop working.

How to avoid this mistake?

Maintain a healthy balance between theoretical and practical learning. Accept that you can't learn everything in one go; you will fill in the gaps while practicing.

2. Heading straight for Machine Learning without knowing the prerequisites:


Many amateur data scientists get motivated by videos from untrusted sources and start chasing a high-end salary. But in fact they need to run a bit longer to reach that goal. It's good to learn the techniques before solving a problem, because this gives you insight into how an algorithm works and how you can fine-tune the process.

Mathematics is very important, so you should know certain concepts from it.
If you want to get into the research domain, you need to know these key areas before entering that field:

  • Calculus
  • Linear Algebra
  • Statistics
  • Probability

How to avoid this mistake?

There are a lot of resources; work through them one by one. Build yourself up one step at a time.

3. Relying only on degrees and certifications:


People have relied on certificates ever since data science became popular. But a certificate alone no longer adds much value to your CV. Hiring managers do not care much about these pieces of paper; they give much more weight to your personality and to the decisions you make in real-life situations.

That's because, to succeed in this domain, you need to know how to deal with clients, understand a data science project's life cycle, and manage deadlines.

How to avoid this mistake?

Certificates are valuable, but only if you also have practical knowledge of real-world situations. Work with real-world datasets and write analysis reports. Take up internships, because they will show you how data scientists actually work.

4. Assuming that ML competitions are what real-life jobs are like:


Many aspiring data scientists have this misconception. Competitions give you datasets that you can download and start working on. Even though those datasets may include missing values that you fill in using an imputation technique, real-world problems don't work like that. A real project consists of an end-to-end pipeline that connects various people who work together to achieve a common goal.

You will have to work with unclear and complex data. It's not wrong to say that 90% of your time will go into collecting and cleaning the data, which becomes part of your daily routine. Here, a simpler model is often preferred over a complex one, because accuracy is not always the end goal.
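To make the imputation idea mentioned above concrete, here is a minimal sketch in plain Python. The records, the column names (`area_sqft`, `price`) and the helper `impute_median` are all made up for illustration, and median imputation is only one of several possible strategies; real projects usually need to ask why a value is missing before filling it in.

```python
from statistics import median

# Hypothetical raw records; None marks a missing value.
rows = [
    {"area_sqft": 1200, "price": 250000},
    {"area_sqft": None, "price": 310000},
    {"area_sqft": 1500, "price": None},
    {"area_sqft": 900, "price": 180000},
]


def impute_median(records, column):
    """Return copies of the records with missing values in `column`
    replaced by the median of the observed values in that column."""
    observed = [r[column] for r in records if r[column] is not None]
    fill = median(observed)
    return [
        {**r, column: fill if r[column] is None else r[column]}
        for r in records
    ]


# Fill each column in turn, then count what is still missing.
cleaned = impute_median(impute_median(rows, "area_sqft"), "price")
missing_before = sum(v is None for r in rows for v in r.values())
missing_after = sum(v is None for r in cleaned for v in r.values())
print(f"missing before: {missing_before}, after: {missing_after}")
```

In a competition this step is often the whole of the "cleaning". In a real job it is usually just the start, because the data also has to be collected and checked first.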

How can you avoid this mistake?

The more experience you gain, the better you will be at handling these situations.

5. Focusing on model accuracy over applicability and interpretability:


As mentioned above, accuracy isn't always the end goal. If you try to explain to your client how you reached 95% accuracy, they may not understand it and may reject your model. It's just not practical to teach a client about neural networks, convolution layers, etc. You should focus on learning how your model works internally; only then can you cater to the needs of the client.

Also, your model has to fit into the organisation's framework. If you have used too many tools and libraries, it may fail because the environment may not be compatible with such a model. Then you will need to redesign your model from scratch with a simpler approach.

How can you avoid this mistake?

There is no better teacher than experience, so talk to people working in the industry. Also, practice making simpler models and then explaining them to non-technical people. This will teach you where to stop and how effective these simpler models are in real-life applications.
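As a rough illustration of what a "simpler model" can look like, the sketch below fits a straight line to a tiny, made-up house-price dataset using ordinary least squares written by hand. The numbers and the helper name `fit_line` are assumptions for the example, not taken from any real project. The point is that the result can be explained to a client in a single sentence.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: area in square feet and sale price.
area = [1000, 1500, 2000, 2500]
price = [150000, 200000, 250000, 300000]

slope, intercept = fit_line(area, price)

# A one-sentence explanation that a non-technical person can follow.
print(f"Each extra square foot adds about {slope:.0f} to the predicted price.")
```

A model like this will rarely win a competition, but it runs anywhere without extra libraries and its single coefficient is easy to defend in a meeting.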

6. Using too many data science terms in your resume:


If your resume has this problem, fix it now! Simply listing all your tools may turn good hiring managers away. Your resume should tell others what you have accomplished and how you accomplished it.

If your resume is too long, or it contains just technical terms like LightGBM, regression, etc., there is a fair chance that it will be rejected in the screening round itself.

How can you avoid this mistake?

The simplest way to fix this is to use bullet points: list only your projects or accomplishments, and write a line about how you did each one, because it helps the recruiter understand your thinking. If you are a fresher, your resume should reflect the potential benefit you can bring to the company.

7. Giving tools and libraries precedence over the business problem:


Imagine you have a dataset on house prices and you need to predict the value of future properties. There are over 200 variables. You drop some of them without really knowing why, and even though you lack that understanding, you still build a model with good accuracy.

Now, if it turns out that a variable you dropped was crucial in real-world analysis, that is a big mistake. Having solid knowledge of tools and libraries is excellent and will add value to your profile. But the real data scientist steps in when they combine that knowledge with the real-life business problem.
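One small habit that helps here is to look at how a variable relates to the target before you drop it. The sketch below computes the Pearson correlation by hand on made-up columns; the column names and values are hypothetical, and the helper `pearson` is written just for this example. Treat it as a first screening step only: a low correlation does not prove a variable is useless, and the business context should have the final say.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical target and two candidate variables.
price = [100, 200, 300, 400]
candidates = {
    "area": [10, 20, 30, 40],
    "door_color_code": [1, 2, 2, 1],
}

# Print the correlation of each candidate with the target before deciding.
report = {name: pearson(values, price) for name, values in candidates.items()}
for name, r in report.items():
    print(f"{name}: correlation with price = {r:.2f}")
```

Writing down the reason next to every dropped variable, even in a short comment, makes it much easier to revisit the decision once someone from the business side questions it.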

How can you avoid this mistake?

Read about how companies in your domain use data science, try to get datasets from specific companies, and start working on them, as this will add unique value to your resume.

8. Spending too little time on visualizing and exploring the data:


Data visualization is very important in data science. Most amateur data scientists skip it and jump straight to model building. This approach might work in competitions, but it tends to fail in the real world. By spending time on this step, you will learn much more about what is actually going on inside the data.

The more curious you are, the more questions you will ask, and this will increase your level of understanding.

How can you avoid this mistake?

Spend time on this step, ask questions, and practice more.
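Exploring the data does not have to start with fancy charts. A first pass can be as simple as the sketch below, which prints the number of missing values, the minimum, the maximum, and the mean for every numeric column. The dataset and the helper `summarize` are made up for illustration, and the code assumes every column has at least one non-missing numeric value.

```python
from statistics import mean

# Hypothetical customer records; None marks a missing value.
records = [
    {"age": 34, "monthly_spend": 120.0},
    {"age": 29, "monthly_spend": 95.5},
    {"age": None, "monthly_spend": 110.0},
    {"age": 230, "monthly_spend": 101.5},  # suspicious value worth a question
]


def summarize(rows):
    """Return basic per-column statistics for a list of dict records."""
    summary = {}
    for column in rows[0].keys():
        values = [r[column] for r in rows]
        present = [v for v in values if v is not None]
        summary[column] = {
            "missing": len(values) - len(present),
            "min": min(present),
            "max": max(present),
            "mean": mean(present),
        }
    return summary


summary = summarize(records)
for column, stats in summary.items():
    print(column, stats)
```

Even this tiny summary raises the kind of question the section is about: a maximum age of 230 is almost certainly a data entry error, and a model trained without noticing it would quietly learn from bad data.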

9. Lack of a structured approach to problem solving:


Structured thinking helps in these ways:

  • It helps the client understand your framework better.
  • It breaks down problem statements into logical blocks.
  • It helps in planning and designing the approach.

Not having a structured approach will make your model complex and less user-friendly, and will ultimately lead to its rejection. In a data science interview, you will be given case studies, guesstimates, and some puzzle problems.

Here, because of the time constraint, the interviewer will look at the approach, the structure of your thoughts, etc., that you used to solve the problem.

How can you avoid this mistake?

  • Acquire structured thinking through training and tests.
  • Keep your approach disciplined and simple.

10. Learning multiple tools at once:


Which one should you learn?

  • SAS
  • R

As each of these tools offers unique features, people tend to try to learn them all at once and end up mastering none of them. Tools are a means to perform the task; they are not the end goal.

How can you avoid this mistake?

Pick one tool and master it. Finish learning the one you have started, and only then move on to the next. This approach will give you the best return.

11. Lack of consistent study:


This applies not just to data scientists: whichever discipline you work in, consistency is a must! Unless you are consistent, you won't remember the concepts and notes you studied earlier. We make excuses because of our busy routines, but eventually it's our loss.

If being a data scientist were easy, everybody would be a data scientist today. It requires a lot of patience and consistency.

How can you avoid this mistake?

Set your goals, plan your schedule, start your work, and be consistent... that's it!

12. Avoiding discussions and competitions:


Amateur data scientists tend to be shy about making their models public, and this is a big concern. Unless you share your models, you won't know where you stand or what errors you are making, and most importantly, you won't get feedback, which is of the utmost importance to every data scientist.

In this field, brainstorming, discussions, and feedback play the most important role.

How can you avoid this mistake?

Start participating in group discussions and competitions. Even if you don't come out on top, you will definitely learn something new that will help you grow at a much faster pace.

13. Lack of communication skills:


A data scientist must have good communication skills in order to present their model effectively. It's not just clients: you will also need to communicate effectively with teammates who are not as experienced in data science as you are. Even interviewers will take note of your communication abilities.

Also, you have to make your clients and other non-technical members understand your model; only then will you get a good return on your work.

You don't have a choice: you need to polish your personality.

How can you avoid this mistake?

Explain data science to non-technical people and practice regularly, as it's the key to success.
Start doing this today.

End note:

There are plenty of other mistakes that data scientists make, but these are the most common ones. Try to avoid them!


I would love to hear your thoughts and your personal experiences. Use the comment section below to let me know.

