As a data scientist, you most likely want to share the helpful modules you created with your teammates or other users. However useful your module is, others will not use it if its functions take a lot of effort to access.
Thus, you want to make it easy for users to use your module. The code to import and run your module should be short. In this article, I will show you 3 ways to make it easy to import and execute your Python modules.
Imagine we have a file called utils.py that…
Did you create some useful functions or classes that you keep reusing across multiple projects? Instead of copying and pasting the same code, wouldn’t it be convenient if you could just import your useful functions like below in any of your projects?
To make your package accessible to you and others, publish it to PyPI. I’m sure you have come across PyPI when installing Python packages. Anybody can upload their package to PyPI, including you.
Does your package need to be complicated to upload to PyPI? No! …
After developing a useful Python package or function, you might want your teammates or other users to use your code. However, the output of your code in the terminal is a little dull and hard to read.
Is there a way that you can make the output clearer and prettier like below?
That is when Rich comes in handy. Besides beautiful printing, Rich also allows you to:
If you want to build a deep learning app, how difficult can it be? It depends on the tools you use to build the model and application. If you want to create a deep learning model with high performance and deploy it in several lines of code, start with fastai and Streamlit.
Inspired by chapter 2 of the book Deep Learning for Coders with fastai and PyTorch by Jeremy Howard and Sylvain Gugger, I decided to create a model to classify dogs and deploy it using Streamlit instead of Voila.
In this tutorial, I will show you how to create…
If you are working in a team, there might be multiple projects within your team. There are also subtasks with due dates for each project. Each subtask is assigned to different team members.
The schedule is even more complicated if every member has different vacation dates. Is there a way that you can visualize all these details like below using Python?
Yes, you can with python-gantt.
A Gantt chart is a bar chart that provides a visual view of tasks scheduled over time. The x-axis represents the timeline, and the y-axis lists the tasks.
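To make the axes concrete, here is a minimal text-only sketch that uses only the standard library. It is not the python-gantt API; the task names, dates, and the `render_gantt` helper are hypothetical and exist only for illustration. Each row is a task (the y-axis) and each column is one day (the x-axis).

```python
from datetime import date

# Hypothetical tasks: (name, start date, end date), with the end date inclusive.
tasks = [
    ("Collect data", date(2021, 1, 4), date(2021, 1, 6)),
    ("Train model", date(2021, 1, 6), date(2021, 1, 8)),
]

def render_gantt(tasks):
    """Return one text row per task; each column is one calendar day."""
    origin = min(start for _, start, _ in tasks)
    last = max(end for _, _, end in tasks)
    width = (last - origin).days + 1
    name_width = max(len(name) for name, _, _ in tasks)
    rows = []
    for name, start, end in tasks:
        offset = (start - origin).days      # idle days before the task starts
        length = (end - start).days + 1     # days the task is active
        bar = "." * offset + "#" * length + "." * (width - offset - length)
        rows.append(f"{name.ljust(name_width)} |{bar}|")
    return rows

chart = render_gantt(tasks)
print("\n".join(chart))
```

A dedicated plotting library adds what this sketch leaves out, such as calendar labels, assigned resources, and vacations.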
There are multiple libraries in Python…
Have you ever looked at a function you wrote one month earlier and found it difficult to understand in 3 minutes? If that is the case, it is time to refactor your code. If it takes you more than 3 minutes to understand your code, imagine how long it would take for your teammates to understand your code.
If you want your code to be reusable, you want it to be readable. Writing clean code is especially important to data scientists who collaborate with other team members in different roles.
You want your Python function to:
If you are using print to debug your code, you might find it confusing to look at many lines of output on your terminal and then try to figure out which code each output belongs to.
For example, running the script below
will give you
30
40
Which one of these outputs is num1? Which one is num2? Two outputs might not be so bad to figure out, but what if there are more than 5 different outputs? Trying to find the source code responsible for each output can be time-consuming.
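The script itself is not reproduced in this excerpt. As a minimal sketch, assuming two variables named `num1` and `num2` that hold 30 and 40, the problem and one standard-library workaround (the `=` specifier in f-strings, available from Python 3.8) might look like this:

```python
# Assumed values, chosen to match the output shown above.
num1 = 30
num2 = 40

# Bare prints: nothing in the output says which variable produced which line.
print(num1)
print(num2)

# The f-string "=" specifier puts the expression text next to its value.
labeled = [f"{num1=}", f"{num2=}"]
for line in labeled:
    print(line)
```

The labeled lines read `num1=30` and `num2=40`, so each output can be traced back to its variable without searching the source.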
…
Have you ever tried to get a general understanding of a CSV file by staring at it, only to end up none the wiser? You can open a Jupyter Notebook to analyze your CSV file, but it is time-consuming to open a notebook just to understand a CSV file, especially when you are working primarily with Python scripts and the terminal.
Is there a way that you can quickly analyze your CSV files from your terminal in 1 line of code such as this?
$ xsv stats bestsellers.csv | xsv table
This is when xsv comes in handy.
xsv is…
If you are given a tweet, can you recognize the gender of its author? You probably can by looking at specific words in the tweet.
For example, if you see the word ‘cute’ in a tweet, there is a high probability that the author is female. Because some words are used more often by a certain gender, it is possible for machine learning models to distinguish between different genders using these gender-related words.
Wouldn’t it be interesting if we could visualize how different words are related to different genders on Twitter? That could be…
If you are working with Python, you probably print output to the terminal, either to debug or to follow the progress of a process. However, if the output is lengthy, it is difficult to keep track of it.
Is there a way to make the important terminal output stand out more, for example by adding color or enlarging the text, like below?
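As a rough sketch of the color part only, assuming a terminal that understands ANSI escape codes, the standard library is enough. The `colorize` helper and the messages are hypothetical, and this is not necessarily the approach the article goes on to use; enlarging text is not covered here.

```python
# ANSI SGR escape codes: a terminal convention, not part of any Python library.
RESET = "\033[0m"
BOLD_RED = "\033[1;31m"
GREEN = "\033[32m"

def colorize(text: str, code: str) -> str:
    """Wrap text in an ANSI style code and reset the style afterwards."""
    return f"{code}{text}{RESET}"

print(colorize("ERROR: training failed", BOLD_RED))
print(colorize("Done", GREEN))
```

Terminals without ANSI support will show the raw escape characters instead of colors.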
Data scientist. I share a little bit of goodness every day through articles and daily data science tips: https://mathdatasimplified.com/