Course - Feature Engineering

Introduction

Want to know how you can improve the accuracy of your ML models? What about how to find which data columns make the most useful features? Welcome to Feature Engineering, where we will discuss good vs. bad features and how you can preprocess and transform them for optimal use in your models.

  1. Introduction to Feature Engineering
  2. Intro to Qwiklabs

Raw Data to Features

Feature engineering is often the longest and most difficult phase of building your ML project. In the feature engineering process, you start with your raw data and use your own domain knowledge to create features that will make your machine learning algorithms work. In this module we explore what makes a good feature and how to represent features in your ML model.

  1. Raw Data to Features
  2. Good vs Bad Features
  3. Quiz: Features are Related to the Objective
  4. Features Known at Prediction-time
  5. Quiz: Features are Knowable at Prediction Time
  6. Features Should be Numeric
  7. Quiz: Features Should be Numeric
  8. Features Should Have Enough Examples
  9. Quiz: Features Should Have Enough Examples (p1)
  10. Quiz: Features Should Have Enough Examples (p2)
  11. Bringing Human Insight
  12. Discussion Prompt: Choosing Relevant Features
  13. Representing Features
  14. ML vs Statistics
  15. Lab Solution: Improve model accuracy with new features
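
As a rough illustration of the "Features Should be Numeric" and "Representing Features" lessons, the sketch below one-hot encodes a categorical column in plain Python. The column name `payment_types`, its values, and the helper functions are hypothetical and not taken from the course materials; this is one common representation, not necessarily the one the course uses.

```python
def build_vocabulary(values):
    """Map each distinct category to an integer index (sorted for determinism)."""
    return {category: index for index, category in enumerate(sorted(set(values)))}


def one_hot(value, vocabulary):
    """Return a one-hot vector; categories outside the vocabulary map to all zeros."""
    vector = [0.0] * len(vocabulary)
    index = vocabulary.get(value)
    if index is not None:
        vector[index] = 1.0
    return vector


# Hypothetical raw categorical column.
payment_types = ["cash", "card", "cash", "voucher"]

vocab = build_vocabulary(payment_types)
encoded = [one_hot(p, vocab) for p in payment_types]

for raw, vec in zip(payment_types, encoded):
    print(raw, vec)
```

The vocabulary is built once from the training data and reused for every later lookup, so a category that never appeared in training gets an all-zero vector instead of raising an error.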

Raw Data to Features

Qwiklabs -- Improve model accuracy with new features

Representing Features

Preprocessing and Feature Creation

This section of the module covers preprocessing and feature creation, two data processing techniques that help you prepare a feature set for a machine learning system.

  1. Preprocessing and Feature Creation
  2. Beam and Dataflow
  3. Lab Intro: Simple Dataflow Pipeline
  4. Lab Solution: Simple Dataflow Pipeline
  5. Data Pipelines that Scale
  6. Lab Intro: MapReduce in Dataflow
  7. Lab Solution: MapReduce in Dataflow
  8. Preprocessing with Cloud Dataprep
  9. Lab Intro: Computing Time-Windowed Features in Cloud Dataprep
  10. Lab Solution: Computing Time-Windowed Features in Cloud Dataprep
  11. Discussion Prompt: Performing Exploratory Analysis

Preprocessing and Feature Creation

Simple Dataflow Pipeline

MapReduce in Dataflow

Apache Beam and Cloud Dataflow

Computing Time-Windowed Features in Cloud Dataprep

Preprocessing with Cloud Dataprep

Feature Crosses

In traditional machine learning, feature crosses don't play much of a role, but in modern-day ML methods, feature crosses are an invaluable part of your toolkit. In this module, you will learn how to recognize the kinds of problems where feature crosses are a powerful way to help machines learn.

  1. Introducing Feature Crosses
  2. What is a Feature Cross?
  3. Discretization
  4. Memorization vs. Generalization
  5. Taxi colors
  6. Lab Intro: Feature Crosses to create a good classifier
  7. Lab Solution: Feature Crosses to create a good classifier
  8. Sparsity + Quiz
  9. Lab Intro: Too Much of a Good Thing
  10. Lab Solution: Too Much of a Good Thing
  11. Implementing Feature Crosses
  12. Embedding Feature Crosses
  13. Where to Do Feature Engineering
  14. Feature Creation in TensorFlow
  15. Feature Creation in Dataflow
  16. Lab Intro: Improve ML Model with Feature Engineering
  17. Lab Solution (p1): ML Fairness Debrief
  18. Lab Solution (p2): Improve ML Model with Feature Engineering
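
To make the idea of a feature cross concrete, here is a minimal sketch in plain Python, assuming two numeric inputs that are first discretized at a boundary of 0 and then crossed into a single categorical id. The data points, labels, and function names are made up for illustration. The labels follow an XOR-like quadrant pattern, which no single straight line over the two raw inputs can separate, but which becomes trivial once each quadrant has its own id.

```python
def bucketize(value, boundaries):
    """Return the index of the bucket that value falls into."""
    return sum(value >= b for b in boundaries)


def cross(bucket_a, bucket_b, num_buckets_b):
    """Combine two bucket indices into one categorical id."""
    return bucket_a * num_buckets_b + bucket_b


# Hypothetical points, one per quadrant; label is 1 when both coordinates share a sign.
points = [(-2.0, -1.5), (-1.0, 3.0), (2.5, -0.5), (1.0, 2.0)]
labels = [1, 0, 0, 1]

boundaries = [0.0]               # one boundary gives two buckets per input
num_buckets = len(boundaries) + 1

crossed = [
    cross(bucketize(x, boundaries), bucketize(y, boundaries), num_buckets)
    for x, y in points
]

# With one weight per crossed id, a linear model can simply memorize each quadrant.
label_by_cross = dict(zip(crossed, labels))
print(crossed, label_by_cross)
```

The number of crossed ids is the product of the bucket counts, so finer discretization quickly produces a large, sparse feature.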

Feature crosses

Improve ML Model with Feature Engineering

TF Transform

TensorFlow Transform (tf.Transform) is a library for preprocessing data with TensorFlow. tf.Transform is useful for preprocessing that requires a full pass over the data, such as:

  - normalizing an input value by mean and standard deviation
  - integerizing a vocabulary by looking at all input examples for values
  - bucketizing inputs based on the observed data distribution

In this module we will explore use cases for tf.Transform.

  1. Introducing TensorFlow Transform
  2. TensorFlow Transform
  3. Analyze phase
  4. Transform phase
  5. Supporting serving
  6. Lab Intro: Exploring tf.transform
  7. Lab Solution: Exploring tf.transform
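
The split between an analyze phase and a transform phase can be sketched without the library itself. The example below is plain Python and does not use the tf.Transform API; the function names `analyze` and `transform` and the `training_fares` data are assumptions chosen for illustration. It shows why normalizing by mean and standard deviation needs a full pass over the data first, after which each value can be transformed on its own.

```python
import statistics


def analyze(column):
    """Analyze phase: one full pass over the training data to compute statistics."""
    return {
        "mean": statistics.fmean(column),
        "stdev": statistics.pstdev(column),  # population standard deviation
    }


def transform(value, stats):
    """Transform phase: apply the stored statistics to a single value."""
    return (value - stats["mean"]) / stats["stdev"]


# Hypothetical training column.
training_fares = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

stats = analyze(training_fares)
scaled = [transform(v, stats) for v in training_fares]

# At serving time the same stored statistics are reused; they are not recomputed.
serving_value = transform(6.0, stats)
print(stats, serving_value)
```

Keeping the computed statistics separate from the per-value function means training and serving apply exactly the same scaling.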

Exploring tf.transform

tf.transform

https://heartbeat.fritz.ai/a-practical-guide-to-feature-engineering-in-python-8326e40747c8