What is Amazon SageMaker Studio Lab?
Amazon SageMaker Studio Lab is a machine learning environment hosted in the cloud (on AWS, of course) that is free to use once you are in the limited release program. It's great for teaching, learning, and trying out new concepts in data science.
Amazon SageMaker Studio Lab is meant to give an aspiring data scientist a place to focus on data science, without having to worry about instances or configurations. It focuses on the JupyterLab + Git workflow experience (with both CPU and GPU resources).
Some quick notes:
- No-charge, no billing, no AWS account required
- Based on open-source JupyterLab, so you can install open-source JupyterLab extensions
- CPU (t3.xlarge) and GPU (g4dn.xlarge) options (check out our article on AWS GPUs for more context)
- 15GB persistent storage and 16GB of RAM
- Git integration: Git command line and Git UI / Share content via GitHub
- You can export your projects and transition from Studio Lab to production-grade SageMaker if needed
- Package management: Persistent installation of pip/conda packages within notebooks and from command line
- Provides terminal access
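Because the runtime can be CPU or GPU and comes with fixed storage, it can help to check what a session actually provides before starting work. The following is a minimal sketch using only the Python standard library. The function name `describe_environment` is our own, and treating the presence of `nvidia-smi` on the PATH as a GPU hint is an assumption, not a Studio Lab feature.

```python
import os
import shutil


def describe_environment(path="."):
    """Summarize CPU count, disk space, and GPU tooling for the current runtime."""
    usage = shutil.disk_usage(path)
    return {
        "cpu_count": os.cpu_count(),
        "disk_total_gb": round(usage.total / 1e9, 1),
        "disk_free_gb": round(usage.free / 1e9, 1),
        # Assumption: nvidia-smi on the PATH is only a hint that a GPU runtime is attached
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this in a notebook cell or from the terminal gives a quick way to compare the session against the CPU, GPU, and storage figures listed above.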
Amazon SageMaker Studio Lab is currently in preview. It is similar in functionality and positioning to Google Cloud's Colab. Amazon SageMaker Studio Lab is based on the same architecture and user interface as SageMaker Studio, but with limited compute and storage and a subset of SageMaker Studio capabilities.
In the image below you can see the launch screen for Amazon SageMaker Studio Lab.
AWS created Amazon SageMaker Studio Lab because it wanted to remove the need to “create an account” or “pull out a credit card”.
AWS also wanted to give users a sandbox where they don't have to worry about leaving instances running and running up a big bill.
Amazon SageMaker Studio Lab also has an upgrade path for moving to full-blown Studio "when you are ready".
To request access to Amazon SageMaker Studio Lab, check out this link.
Comparing Amazon SageMaker Studio Lab and Amazon SageMaker Studio
Now that we understand the basics of Amazon SageMaker Studio Lab, let's take a look at how Amazon SageMaker Studio is different from the original Amazon SageMaker. We start by defining exactly "what is Amazon SageMaker?"
Defining Amazon SageMaker
Amazon SageMaker is defined as a collection of components on the AWS platform that include:
- AWS Console
- SageMaker Notebooks
- SageMaker Studio
- SageMaker SDK
- SageMaker Containers
- Built-in Algorithm Containers
- Container Orchestration
Many of the core differences between Amazon SageMaker's notebook system and Amazon SageMaker Studio are subtle, but we call them out below.
The original Amazon SageMaker platform on AWS allowed you to configure your own notebook instance on AWS to run your machine learning workflow. A SageMaker notebook instance is a machine learning (ML) compute instance running the Jupyter Notebook App.
The user can focus on building models while SageMaker manages creating the instance and associated AWS resources.
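As a rough illustration of what configuring a notebook instance involves, the sketch below builds the parameters for the SageMaker `CreateNotebookInstance` API and passes them to boto3's `create_notebook_instance`. The helper names, the notebook name, and the role ARN are hypothetical placeholders. The submit function needs boto3 and AWS credentials, so it is defined but not called here.

```python
def notebook_instance_request(name, role_arn, instance_type="ml.t3.medium", volume_gb=5):
    """Build the keyword arguments for SageMaker's CreateNotebookInstance call."""
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,
        "RoleArn": role_arn,
        "VolumeSizeInGB": volume_gb,  # storage is attached to this one instance
    }


def create_notebook_instance(request):
    """Submit the request; requires boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    client = boto3.client("sagemaker")
    return client.create_notebook_instance(**request)


request = notebook_instance_request(
    name="example-notebook",  # hypothetical name
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # placeholder ARN
)
print(request)
```

Note that the instance type and the volume size are chosen up front and belong to that single instance, which is the per-instance management the original notebook system requires.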
Amazon SageMaker Studio
Amazon SageMaker Studio is an evolution of SageMaker. A major improvement is in how notebook servers are provisioned and operated, which makes it easier to get your notebook environment working. Amazon SageMaker Studio allows you to manage this from a single screen.
The Evolution of Amazon SageMaker Notebooks
Amazon SageMaker (original):
- Notebooks are managed through the AWS Console.
- A new instance is created to run the notebook server, and all notebooks run on that notebook server.
- You have to manually manage the storage attached to specific notebook instances.
Amazon SageMaker Studio:
- Studio notebooks are managed from inside Studio. The JupyterLab UI is running on a server somewhere.
- No instance is created, and you are not charged for it.
- You are only charged once you bring up a notebook and start doing some compute.
- Multiple notebook servers can run separately in the background, and they can use different notebook kernels (e.g., TensorFlow, PyTorch, etc.).
- Centralized storage concept, plugged into all of the compute instances that are spun up (a major difference between the original SageMaker and Studio).
There is a strategic play here where AWS is focusing on Amazon SageMaker Studio. They are showcasing key features in SageMaker Studio such as the plugin system (e.g., "SageMaker Pipelines").
It's also worth pointing out EFS storage as a key differentiator, a fundamental architectural difference between SageMaker Notebooks and SageMaker Studio.
Amazon SageMaker Studio Pricing
There is no additional charge for using Amazon SageMaker Studio. The costs incurred for running Amazon SageMaker Studio notebooks, interactive shells, consoles, and terminals are based on Amazon Elastic Compute Cloud (Amazon EC2) instance usage. When launched, the resource is run on an Amazon EC2 instance of an instance type based on the chosen SageMaker image and kernel. If an instance of that type was previously launched and is available, the resource is run on that instance.
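The instance-reuse rule and the usage-based billing described above can be sketched as a toy model. This is a simplification for illustration only, not how SageMaker is implemented. The class and function names are our own, and the hourly rates are made-up placeholders rather than real AWS prices.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    instance_type: str
    resources: list = field(default_factory=list)


class InstancePool:
    """Toy model: at most one running instance per instance type, shared by resources."""

    def __init__(self):
        self.running = {}

    def launch(self, resource_name, instance_type):
        # Reuse an instance of the same type if one is already running
        instance = self.running.get(instance_type)
        if instance is None:
            instance = Instance(instance_type)
            self.running[instance_type] = instance
        instance.resources.append(resource_name)
        return instance


def estimated_cost(hours_by_type, hourly_rates):
    """Sum, over instance types, of hours run times the hourly rate."""
    return sum(hours * hourly_rates[t] for t, hours in hours_by_type.items())


pool = InstancePool()
pool.launch("notebook-a", "ml.t3.medium")
pool.launch("terminal-b", "ml.t3.medium")    # shares the instance started for notebook-a
pool.launch("notebook-c", "ml.g4dn.xlarge")  # different type, so a second instance starts
print(len(pool.running), "instances running")

rates = {"ml.t3.medium": 0.05, "ml.g4dn.xlarge": 0.75}  # hypothetical rates, not AWS prices
print(estimated_cost({"ml.t3.medium": 3, "ml.g4dn.xlarge": 1}, rates))
```

In this model the notebook and the terminal on the same instance type add no second instance, so the cost depends on how long each instance type runs rather than on how many notebooks or terminals are open.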
Recommendation: Move to Amazon SageMaker Studio
AWS recommends using Amazon SageMaker Studio over legacy SageMaker when starting a new notebook. Two of the reasons they give for this recommendation are:
- Starting an Amazon SageMaker Studio notebook is faster than launching an instance-based notebook. Typically, it is 5-10 times faster.
- Notebook sharing is an integrated feature in Amazon SageMaker Studio. Users can generate a shareable link that reproduces the notebook code and also the SageMaker image required to execute it, in just a few clicks.
In the image below we can see an example of a notebook launched inside the JupyterLab system on Amazon SageMaker Studio Lab.
Now that we've covered the differences of the versions of Amazon SageMaker, let's connect our Amazon SageMaker Studio Lab notebook to our Snowflake account.