Use Amazon Q Developer to build ML models in Amazon SageMaker Canvas

As a data scientist, I have experienced firsthand the challenges of making machine learning (ML) accessible to business analysts, marketing analysts, data analysts, and data engineers who are experts in their fields but have no ML experience. That’s why I’m especially excited about today’s Amazon Web Services (AWS) announcement that Amazon Q Developer is now available in Amazon SageMaker Canvas. What catches my attention is how Amazon Q Developer helps connect ML expertise with business needs, making ML more accessible across organizations.

Amazon Q Developer helps domain experts build accurate, production-quality ML models through natural language interactions, even if they don’t have ML expertise. Amazon Q Developer guides these users by dissecting their business problems and analyzing their data to recommend step-by-step instructions for building their own ML models. It transforms users’ data to remove anomalies and builds and evaluates custom ML models to recommend the best one, while giving users control and insight into every step of a guided ML workflow. This enables organizations to innovate faster and reduce time to market. It also reduces their reliance on ML experts, so their specialists can focus on more complex technical challenges.

For example, a marketing analyst might say, “I want to predict home sales prices using home characteristics and past sales data,” and Amazon Q Developer translates that into a set of ML steps: analyzing the relevant data, building multiple models, and recommending the best approach.

Let’s see it in action
To get started with Amazon Q Developer, I follow the Getting Started with Amazon SageMaker Canvas guide to launch a Canvas application. In this demo, I use natural language instructions to build a real estate price forecasting model for the marketing and finance teams. On the SageMaker Canvas page, I select Amazon Q and then choose Start a new conversation.

In the new conversation, I write:

I am an analyst and need to forecast real estate prices for my marketing and finance teams.

Next, Amazon Q Developer explains the problem and recommends an appropriate type of ML model. It also outlines the solution requirements, including the necessary dataset characteristics. Amazon Q Developer then asks whether I want to upload my data file or select the target column. I choose to upload my dataset.

In the next step, Amazon Q Developer lists the dataset requirements, which include relevant information about homes, current home prices, and the target variable for the regression model. It then recommends next steps, including: I want to upload my data file, Select an existing dataset, Create a new dataset, or I want to select the target column. For this demo, I use the canvas-sample-housing.csv sample dataset as my existing dataset.

After I select and load the dataset, Amazon Q Developer analyzes it and suggests median_house_value as the target column for the regression model. I accept by choosing I would like to predict the “median_house_value” column. In the next step, Amazon Q Developer details which dataset features (such as “location”, “housing_median_age”, and “total_rooms”) will be used to predict median_house_value.

Before I start training the models, I ask about the quality of the data, because without good data we cannot build a reliable model. Amazon Q Developer responds with quality statistics for my entire dataset.

I can ask specific questions about individual features and their distributions to better understand the quality of the data.
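To make the idea of a per-feature quality check concrete, here is a minimal sketch in plain Python of the kind of summary such a question might return. It is not what Amazon Q Developer runs internally; the `profile_column` helper and the sample values are made up for illustration.

```python
import statistics


def profile_column(values):
    """Summarize one numeric column; None marks a missing value."""
    present = sorted(v for v in values if v is not None)
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "min": present[0],
        "max": present[-1],
        "mean": statistics.mean(present),
        "median": statistics.median(present),
    }


# Made-up values standing in for a "households" column (hypothetical data).
households = [126, 1138, 177, 219, None, 193, 514, 647, 595, 6082]

summary = profile_column(households)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A mean that sits far above the median, as it does here, is a quick hint that a few extreme values are stretching the distribution.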

To my surprise, the previous question reveals that the “households” column has a wide spread between its extreme values, which could affect the accuracy of the model’s predictions. So I ask Amazon Q Developer to fix this outlier issue.

After the transformation is complete, I can ask what steps Amazon Q Developer took to make this change. Behind the scenes, Amazon Q Developer applies advanced data preparation steps using the SageMaker Canvas data preparation capabilities. I can review these steps to visualize and replicate the process that produces the final, prepared dataset for model training.
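The post does not say which transformation Amazon Q Developer applies to the outliers. As an illustration only, the sketch below shows one common technique: clipping values to fences placed 1.5 interquartile ranges beyond the first and third quartiles. The function names, the factor `k`, and the sample values are assumptions for this example, not the Canvas implementation.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clip_outliers(values, k=1.5):
    """Pull values outside the fences back to the nearest fence."""
    lower, upper = iqr_bounds(values, k)
    return [min(max(v, lower), upper) for v in values]


# Made-up "households" values with one extreme entry (hypothetical data).
households = [126, 177, 193, 219, 514, 595, 647, 1138, 6082]

clipped = clip_outliers(households)
print("before:", max(households))
print("after: ", max(clipped))
```

Clipping keeps every row in the dataset, whereas dropping the outlier rows would shrink it; which of the two is appropriate depends on whether the extreme values are errors or genuine observations.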

After reviewing the data preparation steps, I select Start my training job.

After I start the training job, I can see its progress in the conversation, along with the datasets that were created.

As a data scientist, I especially appreciate being able to see detailed metrics with Amazon Q Developer, such as the confusion matrix and precision and recall scores for classification models and root mean square error (RMSE) for regression models. These are the key elements I always look for when evaluating model performance and making data-driven decisions. It’s refreshing to see them presented in a way that is accessible to non-technical users, which builds confidence and enables proper governance while maintaining the depth that technical teams need.

You can access these metrics by selecting the new model from My models or from the Amazon Q conversation menu:

Overview – This tab displays the Column impact analysis. In this case, median_income appears to be the primary factor influencing my model.
Scoring – This tab provides information about model accuracy, including RMSE metrics.
Advanced Metrics – This tab displays the detailed Metrics table, Residuals, and Error density for an in-depth evaluation of the model.
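For readers who have not met these metrics before, here is a small sketch of how residuals and RMSE are computed for a regression model. The house prices are made up, and the two helper functions are illustrative rather than anything SageMaker Canvas exposes.

```python
import math


def residuals(actual, predicted):
    """Residual = actual value minus predicted value, per record."""
    return [a - p for a, p in zip(actual, predicted)]


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    errs = residuals(actual, predicted)
    return math.sqrt(sum(e * e for e in errs) / len(errs))


# Made-up house values and model predictions (hypothetical data).
actual = [200000.0, 150000.0, 320000.0, 275000.0]
predicted = [210000.0, 140000.0, 300000.0, 280000.0]

print("residuals:", residuals(actual, predicted))
print("RMSE:", rmse(actual, predicted))
```

RMSE is expressed in the same unit as the target column, so for median_house_value it reads as a typical prediction error in currency terms. Because the residuals are squared, large misses weigh more heavily than small ones.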

After checking these metrics and validating the model’s performance, I can move on to the final stages of the ML workflow:

Predictions – I can test my model using Predictions to verify how it performs on real data.
Deployment – I can create an endpoint deployment to make my model available for production use.

This simplifies the deployment process, a step that traditionally requires significant DevOps expertise, to a straightforward operation that business analysts can handle with confidence.
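Once an endpoint exists, an application can request predictions from it. The sketch below assumes the endpoint accepts a CSV record through the SageMaker Runtime `invoke_endpoint` call in boto3; the endpoint name, the column order, and the feature values are hypothetical, and the response is returned as raw text because the post does not describe its format.

```python
def to_csv_row(features, column_order):
    """Serialize one record as a single CSV line in a fixed column order."""
    return ",".join(str(features[name]) for name in column_order)


def predict(runtime_client, endpoint_name, csv_row):
    """Send one CSV record to a SageMaker endpoint; return the raw response text."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=csv_row,
    )
    return response["Body"].read().decode("utf-8")


# Assumed column order; it must match the dataset used for training.
COLUMNS = ["housing_median_age", "total_rooms", "households", "median_income"]

# Made-up feature values for one home (hypothetical data).
row = to_csv_row(
    {"housing_median_age": 41, "total_rooms": 880,
     "households": 126, "median_income": 8.3252},
    COLUMNS,
)
print(row)

# Example call, assuming AWS credentials and a deployed endpoint
# (the endpoint name below is a placeholder):
#   import boto3
#   client = boto3.client("sagemaker-runtime")
#   print(predict(client, "my-canvas-housing-endpoint", row))
```

Passing the client in as a parameter keeps the helper easy to test with a stand-in object and leaves credential and region handling to the caller.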

Things you should know
Amazon Q Developer democratizes ML across organizations:

Empowering all skill levels with ML – Amazon Q Developer is now available in SageMaker Canvas and helps business analysts, marketing analysts, and data professionals without ML experience build solutions to business problems through a guided ML workflow. From data analysis and model selection to deployment, users can solve business problems using natural language, which reduces reliance on ML experts such as data scientists and enables organizations to innovate faster and reduce time to market.

Streamlining the ML workflow – With Amazon Q Developer available in SageMaker Canvas, users can prepare data and build, analyze and deploy ML models through a controlled, transparent workflow. Amazon Q Developer provides advanced data preparation and AutoML capabilities that democratize ML and enable non-ML experts to build highly accurate ML models.

Providing full visibility into the ML workflow – Amazon Q Developer provides full transparency by generating the underlying code and technical artifacts, such as data transformation steps, model explainability, and accuracy measurements. This allows cross-functional teams, including ML experts, to review, validate, and update models as needed, facilitating collaboration in a secure environment.

Availability – Amazon Q Developer is now in Preview in Amazon SageMaker Canvas.

Pricing – Amazon Q Developer is now available in SageMaker Canvas at no additional cost to both Amazon Q Developer Pro and Amazon Q Developer Free users. However, standard charges apply for resources such as the SageMaker Canvas workspace instance and any resources used to build or deploy models. For detailed pricing information, visit Amazon SageMaker Canvas Pricing.

To learn more about how to get started, visit the Amazon Q Developer product page.

— Eli
