Khuyen Tran
43K Followers

Pinned

About Me

Who I Am and What Motivates Me to Write. Hello! I’m Khuyen Tran. I have been writing on Medium since December 2019, but I haven’t properly introduced myself, so I wrote this article to do so. I major in statistics, but I love playing with data science and Python tools and sharing them with others in my free time…

4 min read


Published in

Towards Data Science

·Aug 1

How to Build a Fully Automated Data Drift Detection Pipeline

An Automated Guide to Detect and Handle Data Drift. Motivation: Data drift occurs when the distribution of input features in the production environment differs from that of the training data, which can make predictions less accurate and degrade model performance.

Machine Learning

10 min read
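The definition of data drift in the teaser above can be made concrete with a small check. The following is a minimal sketch, not the pipeline the article builds: it assumes a single numeric feature, uses a hand-written two-sample Kolmogorov-Smirnov statistic, and the names `ks_statistic`, `has_drifted`, and the `0.1` threshold are hypothetical choices for illustration.

```python
import random


def ks_statistic(reference, production):
    """Largest gap between the empirical CDFs of two samples (two-sample KS statistic)."""
    ref = sorted(reference)
    prod = sorted(production)
    i = j = 0
    max_gap = 0.0
    while i < len(ref) and j < len(prod):
        value = min(ref[i], prod[j])
        # Step both empirical CDFs past every observation <= value.
        while i < len(ref) and ref[i] <= value:
            i += 1
        while j < len(prod) and prod[j] <= value:
            j += 1
        max_gap = max(max_gap, abs(i / len(ref) - j / len(prod)))
    return max_gap


def has_drifted(reference, production, threshold=0.1):
    """Flag drift when the KS statistic exceeds a (hypothetical) threshold."""
    return ks_statistic(reference, production) > threshold


# Synthetic data: one production sample matches training, the other is shifted.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(1000)]
prod_same = [rng.gauss(0.0, 1.0) for _ in range(1000)]
prod_shifted = [rng.gauss(1.0, 1.0) for _ in range(1000)]

print("same distribution drifted:", has_drifted(train, prod_same))
print("shifted distribution drifted:", has_drifted(train, prod_shifted))
```

In practice the threshold would be replaced by a proper significance test, and the check would run per feature on a schedule.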


Published in

Towards Data Science

·Jul 17

Loguru: Simple as Print, Flexible as Logging

The simple logging solution for your data science project. Originally published at https://mathdatasimplified.com on July 17, 2023. Why Use Logging in a Data Science Project? Data scientists often use the print function to debug their code. However, as the number of print statements grows, it becomes difficult to tell where each output comes from, because print output carries no line numbers or function names.

Logging

8 min read
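The problem described in the teaser above (print output with no line numbers or function names) can be illustrated with a short example. This sketch uses Python's standard `logging` module rather than Loguru, so it does not show Loguru's API; the logger name `logging_demo` and the functions `build_logger` and `clean_data` are hypothetical.

```python
import io
import logging


def build_logger(stream):
    """Create a logger whose records include the calling function and line number."""
    logger = logging.getLogger("logging_demo")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output confined to our handler
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s | %(funcName)s:%(lineno)d | %(message)s")
    )
    logger.addHandler(handler)
    return logger


def clean_data(logger, rows):
    """Drop missing rows and log where the message came from."""
    logger.debug("cleaning %d rows", len(rows))
    return [row for row in rows if row is not None]


buffer = io.StringIO()
log = build_logger(buffer)
cleaned = clean_data(log, [1, None, 3])
output = buffer.getvalue()
print(output, end="")
```

Each record now names the function and line that produced it, which is the information a bare `print` call lacks.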


Published in

Towards Data Science

·Jul 1

Git Deep Dive for Data Scientists

Learn Git through Real-Life Scenarios. Originally published at https://mathdatasimplified.com on July 1, 2023. Why Choose Git for Your Data Science Project? Git is a version control system widely used in software development, but is it the right choice for your data science project? Absolutely. Here are some reasons why Git is invaluable for data science. Version Control. Scenario: You replace the current data processing technique…

Git

9 min read


Published in

Towards Data Science

·Jun 13

Python Dependency Management: Which Tool Should You Choose?

An in-depth comparison of Poetry, Pip, and Conda. Originally published at https://mathdatasimplified.com on June 13, 2023. Motivation: As your data science project expands, the number of dependencies also increases. To keep the project’s environment reproducible and maintainable, it’s important to use an efficient dependency management tool. Thus, I decided to compare three popular tools for dependency management: Pip, Conda…

Python

10 min read


Published in

Towards Data Science

·Jun 5

Streamline dbt Model Development with Notebook-Style Workspace

Interactively Build and Orchestrate Data Models. Originally published at https://mathdatasimplified.com on June 5, 2023. Motivation: dbt (data build tool) is a powerful tool for data transformation within the data warehouse. However, it comes with some limitations, including the following. Lack of Output Preview: With dbt Core, it is not possible to preview the output of a…

Python

7 min read


Published in

Towards Data Science

·May 26

Stop Hard Coding in a Data Science Project — Use Config Files Instead

How to efficiently interact with config files in Python. Originally published at https://mathdatasimplified.com on May 26, 2023. Problem: In your data science project, certain values tend to change frequently, such as file names, selected features, train-test split ratio, and hyperparameters for your model.

Python

6 min read
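The idea in the teaser above, moving frequently changing values out of the code, can be sketched as follows. This is a minimal illustration using the standard-library `json` module, not necessarily the tool the article uses; the keys, file name, and values in `example_config` are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_config(path):
    """Read project settings from a JSON config file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Hypothetical settings of the kind that tend to change between experiments.
example_config = {
    "data": {"raw_path": "data/raw.csv"},
    "features": ["age", "income"],
    "split": {"test_size": 0.2},
    "model": {"max_depth": 5},
}

# Write the config to a temporary file so the example is self-contained.
config_path = Path(tempfile.mkdtemp()) / "config.json"
config_path.write_text(json.dumps(example_config, indent=2), encoding="utf-8")

# The code reads values from the file instead of hard coding them.
config = load_config(config_path)
test_size = config["split"]["test_size"]
features = config["features"]
print(f"Using features {features} with test_size={test_size}")
```

Changing the split ratio or feature list then means editing the config file, with no change to the code itself.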


Published in

Towards Data Science

·Apr 30

What is dbt (data build tool) and When should you use it?

Discover the Hidden Benefits and Drawbacks of dbt. Motivation: If your organization is looking to create a data-driven product, you should consider having efficient data pipelines to: Stay competitive: Quick access to data and its analysis through efficient data pipelines accelerates decision-making and keeps you ahead of your competition. Reduce costs: The time and effort required to collect…

Data Engineering

8 min read


Published in

Towards Data Science

·Apr 16

Automate Machine Learning Deployment with GitHub Actions

Faster Time to Market and Increased Efficiency. Motivation: Consider this scenario: A more accurate machine learning model is developed every month and added to the main branch. To deploy the model, you must download it to your machine, package it, and deploy it.

Python

8 min read


Published in

Towards Data Science

·Apr 13

Simplify Data Science Workflows on BigQuery with Fugue and Python

Speed Up Iteration and Cut Computation Cost. Motivation: Many data teams begin by establishing an analytics practice on data warehouses such as BigQuery. However, relying solely on BigQuery for data science workloads may not be the best approach, for several reasons: Advanced needs beyond SQL: Use cases such as data validation, visualization, and machine learning forecasting may…

Python

6 min read


MLOps Engineer. Website: https://mathdatasimplified.com

Following
  • Dr. Robert Kübler
  • Sofien Kaabar, CFA
  • Tim Denning
  • Priya Dwivedi
  • Nathan Rosidi
