The best data science cheat sheets

Whatever your area of development, knowing how to use the most useful functions of the library you're working with is going to make your life a lot easier.

We've put together a collection of cheat sheets to help you get to grips with the main libraries used in data science.

They are grouped into the fields for which each library is designed: Basics, Databases, Data Manipulation, Data Visualization, Analysis, Machine Learning, Deep Learning and Natural Language Processing (NLP).


Basics

If you're just starting out in the world of data science, it's important to understand how at least two of the basic tools work: the Python language and the NumPy library. Both are used throughout the entire development process. A third tool, SciPy, is a scientific computing library that builds on NumPy and handles more specialised calculations, such as optimisation, integration and statistics.
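
As a minimal sketch of what these basics look like in practice, the snippet below contrasts a plain Python list comprehension with the equivalent NumPy array operation. The sample values and variable names are illustrative assumptions, not taken from any of the cheat sheets.

```python
import numpy as np

# Illustrative sample data (hypothetical prices)
prices = [10.0, 20.0, 30.0, 40.0]

# Plain Python: apply a 10% discount element by element
discounted_list = [p * 0.9 for p in prices]

# NumPy: the same operation applied to the whole array at once
price_array = np.array(prices)
discounted_array = price_array * 0.9

# Common aggregations are available as array methods
mean_price = price_array.mean()   # mean of the original prices
total = discounted_array.sum()    # sum of the discounted prices
```

Both versions produce the same numbers; the NumPy form avoids the explicit loop, which is what makes it convenient for larger data sets.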

Python basics

  • Level: Beginner - Intermediate
  • Area: Basics
  • Description: Python is the general-purpose programming language on which much of the data science ecosystem is built. The way we approach and structure a data science project is inherited from how we work in Python.
  • Source: DataQuest

NumPy basics



Databases

Data can be stored in standalone data sets or, in some cases, in relational or non-relational databases, from which it is imported into the working environment.


SQL

  • Level: Beginner - Intermediate
  • Area: Relational databases
  • Description: Relational databases store data in separate tables and create relations between them using keys, which avoids duplicating information. SQL is the standard language for querying data stored in these tables.
  • Source: sqltutorial
  • Cheat sheet: https://www.sqltutorial.org/sql-cheat-sheet/
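
The idea of separate tables related through keys can be sketched with Python's built-in `sqlite3` module and an in-memory database. The `customers` and `orders` tables and their contents are hypothetical examples, not taken from the cheat sheet.

```python
import sqlite3

# In-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")

# Two tables linked by a key: orders.customer_id points at customers.id
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders (id, customer_id, amount) VALUES
        (1, 1, 25.0), (2, 1, 15.0), (3, 2, 40.0);
""")

# Join the tables on the key and total the order amounts per customer
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

conn.close()
print(rows)
```

Customer names are stored once in `customers`, and each order only carries the key that refers back to them.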


MongoDB

  • Level: Beginner - Intermediate
  • Area: Non-relational databases
  • Description: Non-relational databases are increasingly popular, especially with the rise of big data companies and apps, because they remove the rigid structural constraints imposed by relational databases. MongoDB is one of the most widely used document-oriented databases.
  • Source: codecentric
  • Cheat sheet: https://blog.codecentric.de/files/2012/12/MongoDB-CheatSheet-v1_0.pdf
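
To illustrate the flexible structure of a document store without needing a running MongoDB server, this sketch models a collection as a plain Python list of dictionaries. It does not use MongoDB or its driver; the `find` helper is a hypothetical stand-in that only mimics simple equality filters, and the sample documents are invented.

```python
import json

# A "collection" of documents: unlike rows in a table,
# documents do not have to share the same fields.
users = [
    {"name": "Ana", "city": "Madrid", "skills": ["python", "sql"]},
    {"name": "Ben", "city": "Berlin"},
    {"name": "Cleo", "city": "Madrid", "profile": {"years_experience": 3}},
]

def find(collection, query):
    """Hypothetical helper: return documents whose fields equal the query's values."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in query.items())
    ]

madrid_users = find(users, {"city": "Madrid"})
print(json.dumps(madrid_users, indent=2))
```

Each document carries its own fields, including nested lists and objects, which is the kind of structure a fixed table schema makes awkward.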

Data Manipulation

Before getting started with data analytics, it's essential to organise the data set's information so that it's easier to perform the necessary analytical operations. This process is known as data manipulation.


Data Wrangling

  • Level: Beginner - Intermediate
  • Area: Data manipulation
  • Description: Before conducting an analysis, it's important to clean the DataFrame and organise the data, since data sets often contain duplicate, missing or invalid records. This process of preparing the DataFrame for analysis is known as data cleaning or data wrangling.
  • Source: pandas
  • Cheat sheet: https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf
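
A minimal sketch of that cleaning step with pandas might look like the following. The column names, the sample records and the rule that a negative age is invalid are assumptions made for the example.

```python
import pandas as pd

# Hypothetical raw data with one duplicate row, one missing age and one invalid age
raw = pd.DataFrame({
    "customer": ["Ana", "Ana", "Ben", "Cleo", "Dan"],
    "age": [34, 34, None, 29, -5],
})

clean = (
    raw.drop_duplicates()        # remove the repeated "Ana" row
       .dropna(subset=["age"])   # remove rows where age is missing
       .query("age >= 0")        # remove rows with an invalid (negative) age
       .reset_index(drop=True)   # renumber the remaining rows from 0
)

print(clean)
```

Only the rows for Ana and Cleo survive all three steps; which rules count as "invalid" always depends on the data set at hand.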

Data Visualization

Data visualization is the graphic representation of data. It is particularly important both for conducting analyses and for presenting their results, since it helps us discover trends, outliers and patterns in the data.




Folium

  • Level: Intermediate
  • Area: Data visualization
  • Description: Within the field of visualization, maps are a very useful form of representation that allows us to depict geospatial positions and distances. Folium is a library that allows us to generate interactive maps and easily plot data from a data set, rendering a base map from a provider such as Mapbox or OpenStreetMap and adding layers of visual data like marker clusters or a heatmap.
  • Source: AndrewChallis

Machine Learning

Machine learning algorithms allow us to make predictions based on available data. Predictive algorithms are known as regression or classification algorithms, depending on whether the value being predicted is continuous or categorical. Learning can be supervised or unsupervised, depending on whether or not the model is trained using labelled data, which is known as the 'ground truth'.
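
The difference between regression and classification can be sketched in plain Python. This is an illustration only: it uses a hand-written least-squares fit and a one-nearest-neighbour rule rather than a library such as scikit-learn, and the study-hours data is invented.

```python
# Regression: predict a continuous value (hypothetical hours studied -> exam score)
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]  # labels, i.e. the known ground truth

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(hours, scores)
predicted_score = slope * 5.0 + intercept  # prediction for 5 hours of study

# Classification: predict a category (hypothetical hours studied -> pass/fail)
labelled = [(1.0, "fail"), (2.0, "fail"), (6.0, "pass"), (7.0, "pass")]

def nearest_neighbour(x, examples):
    """Return the label of the training example closest to x."""
    return min(examples, key=lambda example: abs(example[0] - x))[1]

predicted_label = nearest_neighbour(5.5, labelled)

print(predicted_score, predicted_label)
```

Both models are supervised, because they learn from examples whose correct answers are already known.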


Deep Learning

Within the field of machine learning, there is a more specific field known as deep learning, which uses artificial neural networks to make predictions.
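
As a rough sketch of what a neural network computes, the following NumPy snippet runs the forward pass of a tiny two-layer network. The weights are hand-picked for the example so that the network reproduces the XOR function; in a real project they would be learned from data by a deep learning library.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: negative values become 0."""
    return np.maximum(z, 0)

# Hand-picked (not trained) parameters for illustration
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])    # input -> hidden weights
b1 = np.array([0.0, -1.0])     # hidden biases
W2 = np.array([1.0, -2.0])     # hidden -> output weights

def forward(X):
    """Forward pass: linear layer, ReLU activation, linear output."""
    hidden = relu(X @ W1 + b1)
    return hidden @ W2

# All four combinations of two binary inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
outputs = forward(X)
print(outputs)
```

The output is 1 only when exactly one of the two inputs is 1, something a single linear layer without the hidden ReLU units cannot represent.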



TensorFlow

  • Level: Advanced
  • Area: Deep learning
  • Description: TensorFlow is a second-generation deep learning library developed by Google. It allows users to create models using either a low-level or a high-level API, defining individual mathematical operations or whole neural networks, depending on the user's preference.
  • Source: Altoros
  • Cheat sheet: https://cdn-images-1.medium.com/max/2000/1*dtOZSuYDonyyBvEULpJALw.png


PyTorch

  • Level: Advanced
  • Area: Deep learning
  • Description: PyTorch is a deep learning library developed by Facebook. It is one of the newest libraries on the market and offers a more accessible interface for working with tensors than TensorFlow or Keras, for example.
  • Source: PyTorch
  • Cheat sheet: https://pytorch.org/tutorials/beginner/ptcheat.html

Natural Language Processing (NLP)

Within the field of data science, language analysis is an area that's increasingly gaining ground, with algorithms that have been developed to help us analyse text.
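
A very small example of text analysis is counting how often each word appears. The sketch below uses only Python's standard library rather than a dedicated NLP library, and the sample sentence and the simple tokenisation rule are assumptions made for the example.

```python
import re
from collections import Counter

text = "Data science is fun. Data analysis needs clean data!"

# Tokenise: lowercase the text and keep runs of letters and apostrophes
tokens = re.findall(r"[a-z']+", text.lower())

# Count how many times each token occurs
word_counts = Counter(tokens)
most_common = word_counts.most_common(1)

print(most_common)
```

Dedicated NLP libraries build on the same idea and add steps such as removing stop words or reducing words to their root form.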



These cheat sheets contain each library's most useful functions and working methods to help you in your day-to-day development tasks. Happy Coding!
