Cover Courtesy: Photo by Green Chameleon on Unsplash

What is MLS-C01 and how is it helpful?

If you pass the MLS-C01 exam, I can assure you it can help you in your role at your current company, maybe with a promotion or some other benefit. It definitely helped me set myself apart from the crowd in my company, and it helped me get a promotion as well. The outcome varies from person to person and depends on the opportunities in your company.

The certification preparation definitely helped me get a broader perspective and extend my learning beyond my area of expertise, i.e. traditional ML algorithms, which I had worked with for a long time.

  • The best part about MLS-C01 is that it tests practical knowledge drawn from your experience of developing and debugging models
  • Around 40% to 50% of the questions test your knowledge of data science, data engineering, feature engineering and model training
    • The exam can touch on algorithms outside the SageMaker built-in algorithms: not every ML algorithm on earth, but the most popular and widely used ones
    • You need a good knowledge of deep learning frameworks and of building models with them
  • The rest of the questions are about Amazon SageMaker, its built-in algorithms, training on SageMaker, and using AWS resources to deploy models in production
  • People without ML experience from their job have also passed the exam, but they put in the effort to prepare

How did I prepare?

Most of my learning comes from on-the-job experience. However, everyone needs exam-specific preparation, because job experience doesn’t cover the breadth of the exam.

Video tutorials

  • There are many video tutorials on Udemy; I specifically used the one from Frank Kane and Stephane Maarek
  • Another video tutorial and a good bank of practice questions are available on www.aws.training
    • This is the exam readiness course on aws.training, which ends with a free practice exam containing 35 questions
    • At the end of every module, it has 4 to 5 questions, which are bonus preparation material
    • This course is a game changer and helps you gain confidence
  • Another option is to purchase practice exams on Udemy and go through them again and again. Test yourself until you score above 95%

Sudo Tip:

  • If you are in a time crunch and haven’t had a chance to use SageMaker hands-on in your job, here is a tip: watch this video series from Emily Webber on this YouTube playlist
  • These are deep dive sessions on AWS Machine Learning and SageMaker
  • If you find more time, download and run the example notebooks presented in the deep dive videos above
    • Explore these notebooks in detail if you are interested in true learning beyond exam preparation
    • These are guided notebooks, so there is no need to follow along with the videos. All the necessary explanations are in the notebooks
    • Believe me! It’s a goldmine of information on Machine Learning with AWS
  • Watch all the videos and don’t skip any of them (maybe you can skip SageMaker Studio, as it doesn’t show up on the exam yet, as of writing this blog). I found them extremely useful even though I had experience using SageMaker.

A few pointers from my experience of the exam

  • t-SNE comes up very often. Read as much as you can about t-SNE as both a dimensionality reduction technique and a visualization technique
  • Learn how to calculate TP, TN, FP and FN from a less common style of confusion matrix, one drawn as a heat map with percentages, which may also show F1 scores and the total number of samples. I couldn’t answer this kind of question, so it is better to be prepared
    • A confusion matrix is very confusing. Until you take pen and paper and solve many numerical questions, you will not master it
    • Go back to college mode and practice this again and again
  • More machine learning concepts: how to reduce overfitting and underfitting, for both traditional ML algorithms and DNNs; DNN concepts, hyperparameters and architectures (CNN, RNN, etc.) - check my blog on hyperparameters for more details.
  • Kinesis streams in depth, including the kinds of integrations supported - check my blog on Kinesis streams
  • Built-in algorithms in depth - you should know what they support and don’t support. Check my built-in algorithms comparison blog
  • Tricky questions in statistics and data preparation
    • Whether a distribution is right-tailed or left-tailed
    • What kind of transformations to use to make them normal
  • Identifying which distribution to use for a given scenario, e.g. the number of times a bus arrives at a station in a given interval would be Poisson
    • Similarly, study all the other distributions
  • If you don’t have experience in ML, focus on data science techniques, data exploration and statistics. If you do, that part will be easy and you’ll have to focus more on SageMaker and MLOps (and vice versa if your experience is mainly with SageMaker)
  • MLOps using SageMaker
    • Deployments
    • A/B testing
    • Canary deployments i.e. using [production variants](https://docs.aws.amazon.com/sagemaker/latest/dg/model-ab-testing.html)
    • Setting up autoscaling
    • Multi-model deployments, etc.
  • Know the data stores very well by use case, e.g. the difference between OLTP and OLAP
  • How to build data lakes, and the relative cost of storage options such as EFS, EBS, instance store and S3
  • Know when to use Redshift vs. Athena vs. Redshift Spectrum
  • A few security concepts: securing data at rest and in transit, using VPCs with SageMaker, etc.
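
For the percentage-style confusion matrix mentioned in the pointers above, a small Python sketch of the arithmetic may help with practice. It assumes a 2x2 matrix whose cells are percentages of all samples (not normalized per row), with the positive class listed first; the function name and the sample numbers are made up for illustration. If the figure in a question is normalized differently, the cells have to be read differently.

```python
def counts_from_percent_matrix(percent_matrix, total_samples, actual_on_rows=True):
    """Convert a 2x2 confusion matrix given in percentages of all samples to counts.

    Assumed cell layout (positive class first):
        actual_on_rows=True  -> [[TP, FN], [FP, TN]]  (rows = actual)
        actual_on_rows=False -> [[TP, FP], [FN, TN]]  (rows = predicted)
    """
    (a, b), (c, d) = percent_matrix
    if not actual_on_rows:
        # Transpose so that the rows always hold the actual classes
        b, c = c, b

    def to_count(pct):
        return round(pct * total_samples / 100)

    return {"TP": to_count(a), "FN": to_count(b), "FP": to_count(c), "TN": to_count(d)}


# Hypothetical example: 200 samples, cells shown as percentages of the total
cells = counts_from_percent_matrix([[40.0, 10.0], [5.0, 45.0]], total_samples=200)
print(cells)  # {'TP': 80, 'FN': 20, 'FP': 10, 'TN': 90}
```

The `actual_on_rows` flag is there because the two orientations put FP and FN in swapped cells, which is the easiest place to slip.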

Sudo-code Tips

  • As soon as you enter the exam center (I prefer an exam center over online tests at home), write down all the formulae you need for the exam on the rough sheet given to you. This serves as a quick reference and saves a lot of time
    • Write down the confusion matrix in two ways; this helps you pick the right numbers from the right cells in the matrix for calculations
      • Predicted on x-axis and Actual on y-axis
      • The other way round, i.e. Predicted on y-axis and Actual on x-axis
    • Write down formulae for the following
      • Recall or Sensitivity or True Positive Rate
      • Precision or Positive Predictive Value
      • Specificity or True Negative Rate
      • F1 Score in two ways
        • Using TP, FP and FN
        • Using Precision and Recall
    • Write down the formula for calculating the number of Kinesis shards
    • Algorithm-specific formulae, e.g. TF-IDF calculation
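
As a practice aid, the formulae in the list above can be written as a short Python sketch. The metric definitions are the standard ones; the function names and sample numbers are invented for illustration. The shard estimate assumes the commonly cited per-shard limits of 1 MB/s or 1,000 records/s for writes and 2 MB/s for reads, so please check the current Kinesis Data Streams quotas before relying on it.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix metrics (no guard against zero denominators)."""
    recall = tp / (tp + fn)        # sensitivity, true positive rate
    precision = tp / (tp + fp)     # positive predictive value
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        # F1 written two ways; both should give the same number
        "f1_from_counts": 2 * tp / (2 * tp + fp + fn),
        "f1_from_precision_recall": 2 * precision * recall / (precision + recall),
    }


def estimate_kinesis_shards(write_mb_per_s, read_mb_per_s, records_per_s=0):
    """Rough shard estimate under the assumed per-shard limits:
    1 MB/s or 1,000 records/s in, 2 MB/s out."""
    return max(
        1,
        math.ceil(write_mb_per_s / 1.0),
        math.ceil(read_mb_per_s / 2.0),
        math.ceil(records_per_s / 1000),
    )


# Hypothetical numbers for practice
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=90)
shards = estimate_kinesis_shards(write_mb_per_s=5, read_mb_per_s=12, records_per_s=3500)
print(metrics)
print(shards)  # 6, because the read side (12 / 2) needs the most shards
```

Working the same numbers by hand and comparing against the output is a quick way to check that you are picking the right cells.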

Sudo Exam Tip: Check out this special formulae cheat sheet, authored by sudo-code, for easy exam preparation! The cheat sheet also has solved numerical questions to refer to and learn from.

I wish you all the best for your exam! Take a leap of faith with the right preparation, and you shall conquer!