5 tips for every Data Science project

Luca Pessina
Published in Nerd For Tech
3 min read · Aug 20, 2021

Here are 5 very useful tips from my personal experience as a data scientist.

1. Focus on the data

It may seem like silly advice, but every time I start a new data science project it’s hard not to start building models right away. It is very useful, and correct, to have a look at the data first. Every minute you spend on the data is an investment, for two main reasons. The first is that you need to understand the data: what they look like, their main features and the types of data you have. The second is to avoid future errors: if you immediately start building models, you could miss some important characteristics or, even worse, make some false assumptions.
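As a rough illustration of what a first look at the data can mean in practice, here is a minimal sketch using only the Python standard library. The sample CSV, its column names and the `summarize` helper are all hypothetical and made up for this example; on a real project you would probably reach for your usual data library instead.

```python
import csv
import io
import statistics

# Hypothetical sample data, inlined so the sketch runs on its own.
RAW_CSV = """age,city,income
34,Milan,42000
29,Rome,
41,Milan,58000
,Turin,39000
"""


def summarize(rows):
    """Return, for each column, missing-value counts and basic statistics."""
    summary = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        present = [v for v in values if v != ""]
        info = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        try:
            # A column counts as numeric only if every present value parses.
            numbers = [float(v) for v in present]
        except ValueError:
            info["type"] = "text"
        else:
            info["type"] = "numeric"
            if numbers:
                info["min"] = min(numbers)
                info["max"] = max(numbers)
                info["mean"] = statistics.mean(numbers)
        summary[column] = info
    return summary


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
report = summarize(rows)
for column, info in report.items():
    print(column, info)
```

Even a summary this small shows the type of each column and where values are missing, which is exactly the kind of characteristic that is easy to overlook when you jump straight to modelling.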

2. Make simple plots

Plotting data might seem easy, but it is not! Nowadays there are plenty of libraries that let you plot almost anything, and it doesn’t matter whether you use R, Python or Tableau: the basics of data visualization are the same. The plots you make should depend on the circumstances you encounter. It can be counterproductive to show very complex plots to your business boss: you spend 10 minutes explaining, they still don’t understand, and you have lost them. Start with very simple plots and then, if you have a very specific audience, you can add some complexity and show your skills.

3. Use simple methods

You have just studied the latest neural network implementation and you want to apply it to your new project as fast as you can. Don’t do that!

Complexity doesn’t mean usefulness or performance; you always need the right method for the right problem. Be sure you understand the problem, and start with the simplest useful method to solve it. It can be a linear model such as a linear regression or a logistic regression. What is crucial is to use a useful model, not a complex one! A model that is too complex, in the early stages of the project, can also lead you to false results and make you overestimate the information available. Remember, the data give the information, not the methods.
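To make the idea of starting simple concrete, here is a minimal sketch of a one-feature linear regression written in plain Python and compared against an even simpler baseline that always predicts the mean. The toy data and the helper names `fit_line` and `mean_squared_error` are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(ys, predictions):
    """Average squared difference between observed and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Hypothetical toy data, invented for this example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
model_predictions = [slope * x + intercept for x in xs]

# Baseline: always predict the mean of the observed values.
mean_y = sum(ys) / len(ys)
baseline_predictions = [mean_y] * len(ys)

model_mse = mean_squared_error(ys, model_predictions)
baseline_mse = mean_squared_error(ys, baseline_predictions)

print("slope:", slope, "intercept:", intercept)
print("model MSE:", model_mse, "baseline MSE:", baseline_mse)
```

If a simple model like this already beats the baseline by a wide margin, you have a reference point: any more complex method you try later has to justify itself against it.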

4. Reach an end

I often look at projects that seem pretty endless. That’s not the point!

In every data science project there are always some questions to answer; once you are done with them, end your work. It isn’t useful or professional to continue your work without reaching an end. I know it is always possible to implement a new method, add new data or optimize some aspects. My advice is to get to the point and then, if you have some time left, try different approaches and techniques. An endless project is just unfinished work, not a badge of merit.

5. Useful results

The starting point is that the results must be correct, under some assumptions.

It is very important that the questions you want to answer are the right ones, and that they can be answered with the data given. Sometimes it is not possible to reach the results we wanted at the beginning. Never mind, that is a useful result too! Try to understand whether the results you have are authentic or are forced by the models you used in the analysis. The results your boss wants must be useful in a practical way, not just a technical exercise.

Thank you for reading!

Luca Pessina

Data Science student and start-up enthusiast. I write about data science, artificial intelligence and everyday lifestyle. Enjoy!