This post outlines the ETL pipeline we developed to process features for training and deploying a job recommender model at Talent.com. The pipeline uses SageMaker Processing jobs for efficient data processing and feature extraction at large scale. The feature extraction code is written in Python, so it can use popular ML libraries directly, without having to be ported to PySpark.
Originally appeared here:
Streamlining ETL data processing at Talent.com with Amazon SageMaker