• Scalable Machine Learning for Small Teams

    System Design Training

    Nowadays, data scientists are expected to build distributed systems that are scalable and robust. Such systems run programs in parallel and must be resilient enough to recover from failures. In this project, I will build a scalable system with solid tools such as PySpark, which lets a data scientist build end-to-end programs more efficiently and quickly. Small teams (start-ups, small companies, or projects with limited budgets and resources) want to take advantage of a handful of existing tools, such as cloud environments and existing ML libraries. Google Cloud Platform provides a lot of solid environments and…

  • Data Structures and Algorithms for Interview

    During my Ph.D., I spent years optimizing the code for my simulations at very large scales. I optimized iteration loops, implemented math libraries for high-performance matrix calculations, and examined built-in functions in MATLAB vs. my C functions. Fortunately,…

  • Machine Learning: Natural Language Processing

    There is a lot of excitement around the potential of natural language processing (NLP) with machine learning. Some believe that NLP with machine learning will soon be able to automatically generate translations that are as accurate as those produced by…

  • time series forecasting

    Time Series Forecasting: Store Sales

    Welcome to my machine learning project to predict store sales for a grocery retailer. The datasets contain a lot of useful, real-world information for a specific case of time series forecasting. I used time-series forecasting to forecast…

  • Carbon nanotube

    Prediction of Electrical Conductivity for Nanocomposites

    Welcome to my machine learning project to predict the electrical conductivity of nanofiller-reinforced polymer nanocomposites. Common polymers are not normally conductive (except for some conductive polymers). To make them conductive, they can be reinforced with conductive nanofillers. Common nanofillers…

  • Hello!

    I’m Linh Hoang, a Data Scientist with a strong eagerness to learn new things and share what I understand to help others and receive feedback. Feel free to leave what you think!

  • Lake Louis

    Learn and Share

    I am eager to learn new things and new technologies, and I would like to share what I have learned to help others and to receive feedback that keeps me up to date. Feel free to read…

  • recommendation systems

    Recommendation systems: Amazon Products and Reviews

    Nowadays, recommendation systems are one of the most important keys to the success of e-commerce platforms such as Amazon and other online retailers, because: They help users find the right product. They recommend other interesting products to users…

  • Poetry Machine

    Poetry Machine 3.0

    Poetry Machine is a non-profit project for the Vietnamese community. Introduction: Nowadays, young people are used to many types of modern entertainment and forget traditional culture such as novels and poems. Poetry Machine is a non-profit project which aimed…

  • Zimbra as a service

    Zimbra multi-tenancy service

    In 2011, I developed an open-source Zimbra-as-a-service module to help enterprises and businesses ease their mail tasks on the system. The free open-source version of Zimbra had limited features. Hence, this tool was built to allow the system admin…

  • cms hlp4ever

    Content Management System in PHP/HTML/CSS/JavaScript

    This was my first comprehensive personal project, written when I started learning PHP in 2010. The program does not run properly on the newest versions of PHP, as the main code was based on PHP 5 and was not maintained with many deprecated…