Hello! I’m Khuyen Tran. I have been writing on Medium since December 2019, but I haven’t properly introduced myself, so I wrote this article to do that.
I major in statistics, but in my free time I love playing with data science and Python tools and sharing them with others. Thus, I decided to write at least one article per week. At the time of writing, I have written a total of 97 articles.
I love open-source tools, but it can be difficult to understand what they do without spending hours on them. Thus, some cool packages are…
Have you ever wanted to explore a dataset in a browser or publish your dataset so that others can explore and download your data? If so, try Datasette.
Below is what the website for your data will look like after you publish it with Datasette.
Before digging into the article, you can try exploring FiveThirtyEight’s Hate Crimes dataset using Datasette first.
Datasette is a tool for exploring your data in a web browser and publishing it as an interactive website.
To install Datasette, type:
pip install datasette
If this doesn’t work for you, find other ways to install Datasette…
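Datasette serves SQLite database files, so a quick way to try it is to build a tiny database first. The snippet below is a minimal sketch using only Python's standard library; the function name build_demo_db, the cities table, and its rows are all made up for illustration and do not come from the original article.

```python
import sqlite3


def build_demo_db(path):
    """Create a small SQLite file that Datasette could serve.

    The table name and rows are hypothetical placeholders.
    Returns the number of rows stored in the table.
    """
    conn = sqlite3.connect(path)
    with conn:  # commits the transaction on success
        conn.execute(
            "CREATE TABLE IF NOT EXISTS cities (name TEXT, population INTEGER)"
        )
        conn.execute("DELETE FROM cities")  # keep reruns idempotent
        conn.executemany(
            "INSERT INTO cities (name, population) VALUES (?, ?)",
            [("Example City A", 100), ("Example City B", 250)],
        )
    row_count = conn.execute("SELECT COUNT(*) FROM cities").fetchone()[0]
    conn.close()
    return row_count
```

After calling build_demo_db("demo.db"), running datasette serve demo.db in a terminal should start a local site where you can browse the table.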
Have you ever struggled with the math concepts of a machine learning algorithm and used 3Blue1Brown as a learning resource? 3Blue1Brown is a famous math YouTube channel created by Grant Sanderson. Many people love 3Blue1Brown because of Grant’s great explanations and cool animations like the one below.
Wouldn’t it be cool if you could learn how he created these animations so you can create similar animations to explain data science concepts to your teammates, managers, or followers?
Luckily, Grant put together a Python package called manim that enables you to create mathematical animations or pictures using Python…
Have you ever wanted to create a web application in only several lines of Python code? Streamlit allows you to do that, but it doesn’t give you a lot of options to customize your input box, output, layout, and pages.
If you are looking for something that is easier to learn than Django and Flask, but more customizable than Streamlit, you will love PyWebIO.
PyWebIO is a Python library that allows you to build simple web applications without any knowledge of HTML and JavaScript. PyWebIO can also be easily integrated into existing web services such as Flask or Django.
To…
If you are a data scientist or programmer, the terminal might be your go-to tool for much of your work. Wouldn’t it be nice if you could view your schedule, to-do list, system information, Docker, Git, and updates from your favorite blogs like below:
That is when WTF comes in handy.
WTF allows you to monitor systems, services, and important information. Here are a few things that you can view with WTF:
and many more! In this article, I will show you how to add these pieces of information to…
After cleaning the data, creating beautiful visualizations, and getting some helpful insights, you want to share your findings with your social network, friends, or managers, but find it difficult to do so using Jupyter Notebook.
There are ways for you to share your notebook using Binder or GitHub, but they are not interactive, and it takes time for you to do so.
Is there a way that you can create a report for your findings like below in a few lines of code using Python?
That is when Datapane comes in handy.
Datapane is an API for…
A t-test is useful for finding out whether there is a significant difference between two groups. However, a t-test cannot be used to compare three or more independent groups.
For example, if you are a product developer, you might want to know whether the change in the percentage of cotton used in the synthetic fiber will result in an increase in the tensile strength of the fiber.
To find out, you can conduct a completely randomized experiment with five levels of cotton content and replicate the experiment five times representing five trials. …
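One common technique for comparing three or more independent groups is a one-way ANOVA. The excerpt above does not name the method the full article uses, so treat the snippet below as a sketch under that assumption. It computes the F statistic with the standard library only, and the numbers in the example are made up for illustration, not real tensile-strength measurements.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups.

    Each group is a list of observations from one treatment level.
    """
    k = len(groups)                           # number of treatment levels
    n_total = sum(len(g) for g in groups)     # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical toy data: three levels with three replicates each
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova_f(toy_groups))  # prints 3.0
```

A large F value suggests that at least one group mean differs from the others. To turn it into a p-value, you would compare it against an F distribution with k - 1 and n_total - k degrees of freedom.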
Disclaimer: This article is only for educational purposes. We do not encourage anyone to scrape websites, especially those web properties that may have terms and conditions against such actions.
GitHub is a website that helps developers store and keep track of their code. I like GitHub not only because it allows me to control my code, but also because it helps me discover interesting libraries, projects, and cool developers.
There are over 56 million developers on GitHub. This means that just by exploring GitHub, you can potentially find partners for your software or your next employee. …
As a data scientist, you most likely want to share the helpful modules you created with your teammates or other users. Although your module might be useful, others will not use it if it takes them a lot of effort to access the useful functions in your module.
Thus, you want to make it easy for users to use your module. The code to import and run your module should be short. In this article, I will show you 3 ways to make it easy to import and execute your Python modules.
Imagine we have a file called utils.py
that…
Have you created some useful functions or classes that you keep reusing in multiple projects? Instead of copying and pasting the same code, wouldn’t it be convenient if you could just import your useful functions like below in any of your projects?
To make your package accessible to you and others, publish to PyPI. I’m sure you have come across PyPI when trying to install some Python packages. Anybody can upload their package to PyPI, including you.
Does your package need to be complicated to upload to PyPI? No! …
Data scientist. I share a little bit of goodness every day through daily data science tips: https://mathdatasimplified.com/