Course - Feature Engineering
Introduction
Want to know how you can improve the accuracy of your ML models? Or how to find which data columns make the most useful features? Welcome to Feature Engineering, where we discuss good versus bad features and how you can preprocess and transform them for optimal use in your models.
- Introduction to Feature Engineering
- Intro to Qwiklabs
Raw Data to Features
Feature engineering is often the longest and most difficult phase of building your ML project. In the feature engineering process, you start with your raw data and use your domain knowledge to create features that make your machine learning algorithms work. In this module we explore what makes a good feature and how to represent features in your ML model (a short sketch after this module's outline shows one way to do this).
- Raw Data to Features
- Good vs Bad Features
- Quiz: Features are Related to the Objective
- Features Known at Prediction-time
- Quiz: Features are Knowable at Prediction Time
- Features should be Numeric
- Quiz: Features Should be Numeric
- Features Should Have Enough Examples
- Quiz: Features Should Have Enough Examples (p1)
- Quiz: Features Should Have Enough Examples (p2)
- Bringing Human Insight
- Discussion Prompt: Choosing Relevant Features
- Representing Features
- ML vs Statistics
- Lab Solution: Improve model accuracy with new features
Raw Data to Features
Qwiklabs -- Improve model accuracy with new features
Representing Features
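As a companion to the "Representing Features" lesson, here is a minimal sketch, not taken from the course labs, of turning raw columns into model features with the tf.feature_column API. The column names (sq_footage, city) and the vocabulary are made-up assumptions for illustration.

```python
import tensorflow as tf

# Numeric columns can be passed to the model directly.
sq_footage = tf.feature_column.numeric_column('sq_footage')

# Categorical strings must be encoded, e.g. one-hot via an indicator column.
city = tf.feature_column.categorical_column_with_vocabulary_list(
    'city', vocabulary_list=['Seattle', 'Portland', 'San Francisco'])
city_one_hot = tf.feature_column.indicator_column(city)

# These columns would then be handed to an estimator or a DenseFeatures layer.
feature_columns = [sq_footage, city_one_hot]
```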
Preprocessing and Feature Creation
This module covers preprocessing and feature creation: data processing techniques that help you prepare a feature set for a machine learning system. A simple pipeline sketch follows the outline below.
- Preprocessing and Feature Creation
- Beam and Dataflow
- Lab Intro: Simple Dataflow Pipeline
- Lab Solution: Simple Dataflow Pipeline
- Data Pipelines that Scale
- Lab Intro: MapReduce in Dataflow
- Lab Solution: MapReduce in Dataflow
- Preprocessing with Cloud Dataprep
- Lab Intro: Computing Time-Windowed Features in Cloud Dataprep
- Lab Solution: Computing Time-Windowed Features in Cloud Dataprep
- Discussion Prompt: Performing Exploratory Analysis
Preprocessing and Feature Creation
Simple Dataflow Pipeline
MapReduce in Dataflow
Apache Beam and Cloud Dataflow
Computing Time-Windowed Features in Cloud Dataprep
Preprocessing with Cloud Dataprep
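The pipeline below is a minimal sketch in the spirit of the "Simple Dataflow Pipeline" and "MapReduce in Dataflow" labs, written with the Apache Beam Python SDK; the file paths are placeholders and this is not the labs' actual code.

```python
import apache_beam as beam

with beam.Pipeline() as p:
    (p
     | 'Read' >> beam.io.ReadFromText('input.txt')            # placeholder input path
     | 'Split' >> beam.FlatMap(lambda line: line.split())     # "map" step: words
     | 'PairWithOne' >> beam.Map(lambda word: (word, 1))
     | 'CountPerWord' >> beam.CombinePerKey(sum)              # "reduce" step: counts
     | 'Format' >> beam.Map(lambda kv: f'{kv[0]}: {kv[1]}')
     | 'Write' >> beam.io.WriteToText('word_counts'))         # placeholder output prefix
```

The same pipeline runs locally or on Cloud Dataflow; switching runners is a matter of pipeline options (runner, project, region) rather than rewriting the transforms.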
Feature Crosses
In traditional machine learning, feature crosses don't play much of a role, but in modern ML methods they are an invaluable part of your toolkit. In this module, you will learn how to recognize the kinds of problems where feature crosses are a powerful way to help machines learn (see the sketch after this module's outline).
- Introducing Feature Crosses
- What is a Feature Cross?
- Discretization
- Memorization vs. Generalization
- Taxi colors
- Lab Intro: Feature Crosses to create a good classifier
- Lab Solution: Feature Crosses to create a good classifier
- Sparsity + Quiz
- Lab Intro: Too Much of a Good Thing
- Lab Solution: Too Much of a Good Thing
- Implementing Feature Crosses
- Embedding Feature Crosses
- Where to Do Feature Engineering
- Feature Creation in TensorFlow
- Feature Creation in Dataflow
- Lab Intro: Improve ML Model with Feature Engineering
- Lab Solution (p1): ML Fairness Debrief
- Lab Solution (p2): Improve ML Model with Feature Engineering
Feature crosses
Improve ML Model with Feature Engineering
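To make the idea concrete, here is a sketch of a feature cross in the spirit of the taxi example: two continuous coordinates are discretized into buckets, and the buckets are crossed so a linear model can memorize location-specific effects. The feature names, boundaries, and bucket sizes are illustrative assumptions, not the lab's values.

```python
import numpy as np
import tensorflow as tf

# Raw continuous inputs (names are placeholders).
lat = tf.feature_column.numeric_column('pickup_latitude')
lon = tf.feature_column.numeric_column('pickup_longitude')

# Discretization: bin each coordinate into buckets.
lat_buckets = tf.feature_column.bucketized_column(
    lat, boundaries=np.linspace(40.5, 41.0, 20).tolist())
lon_buckets = tf.feature_column.bucketized_column(
    lon, boundaries=np.linspace(-74.3, -73.7, 20).tolist())

# Feature cross: one sparse feature per (lat bucket, lon bucket) cell.
loc_cross = tf.feature_column.crossed_column(
    [lat_buckets, lon_buckets], hash_bucket_size=10000)

# Embed the cross so a DNN can generalize from it.
loc_embedding = tf.feature_column.embedding_column(loc_cross, dimension=8)
```

A wide-and-deep model can take the raw cross on the linear (memorization) side and the embedding on the DNN (generalization) side, which is the trade-off the Memorization vs. Generalization lesson discusses.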
TF Transform
TensorFlow Transform (tf.Transform) is a library for preprocessing data with TensorFlow. tf.Transform is useful for preprocessing that requires a full pass over the data, such as:
- normalizing an input value by its mean and standard deviation
- integerizing a vocabulary by looking at all input examples for values
- bucketizing inputs based on the observed data distribution
In this module we will explore use cases for tf.Transform; a short sketch after the outline below illustrates these use cases.
- Introducing TensorFlow Transform
- TensorFlow Transform
- Analyze phase
- Transform phase
- Supporting serving
- Lab Intro: Exploring tf.transform
- Lab Solution: Exploring tf.transform
Exploring tf.transform
tf.transform
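As an illustration of the use cases above, here is a minimal preprocessing_fn sketch assuming the tensorflow-transform package and made-up feature names (fare_amount, payment_type, trip_distance); it is not the lab's code.

```python
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    """Full-pass preprocessing: each tft analyzer scans the whole dataset."""
    return {
        # Normalize by the mean and standard deviation computed over the dataset.
        'fare_scaled': tft.scale_to_z_score(inputs['fare_amount']),
        # Map each string to an integer id from a vocabulary built over all examples.
        'payment_type_id': tft.compute_and_apply_vocabulary(inputs['payment_type']),
        # Bucketize based on quantiles of the observed distribution.
        'distance_bucket': tft.bucketize(inputs['trip_distance'], num_buckets=10),
    }
```

During the analyze phase the tft analyzers compute the dataset-wide statistics (means, vocabularies, quantile boundaries); the transform phase, and later serving, apply those same statistics, which is what the Analyze phase, Transform phase, and Supporting serving lessons cover.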