It’s a wrap at AWS Re:Invent, but here’s my take on two more data-and-analytics-related announcements from Las Vegas: Amazon Q in Redshift and Amazon Q in AWS Glue.  

 

Amazon Q Recap

 

To my mind, Amazon Q was the most broadly compelling and exciting GenAI announcement at AWS Re:Invent 2023. Amazon Q is a multi-purpose AI assistant for businesses that’s designed to develop a company-specific understanding of information including data, text, code, and technology systems in use. At this point there are 40 connectors to enterprise applications and systems including Salesforce, Zendesk, ServiceNow, Office 365, Dropbox and more.

 

Amazon Q is also designed to deliver a personalized experience to the individual user, limiting access to information based on their role and data-access permissions. The assistant will be available within a growing number of interfaces. Behind the scenes, Q will choose from a variety of GenAI models available in Amazon Bedrock based on the context of where it’s used. Exposed within the AWS Console, for example, Q will help with cloud troubleshooting and best-practice recommendations. Within an integrated development environment (IDE), Q will help developers generate, test and troubleshoot code. Exposed through Amazon QuickSight (one of the first use cases announced this week), Q will support natural language (NL) query and explanations powered by GenAI.

 

 

Amazon Q in Redshift and Q in Glue

 

This brings us to Amazon Q in Redshift, which will be exposed through the Amazon Redshift Query Editor, the data warehouse service’s web-based SQL editor. Users will simply ask NL questions and Amazon Q will generate SQL recommendations, using the appropriate large language model (LLM) from Amazon Bedrock.

 

According to AWS, Amazon Q will use different techniques, such as prompt engineering and Retrieval Augmented Generation (RAG), to query the model based on context including the database instance, the schema, the user’s query history, and, optionally, the query history of other users connected to the same endpoint. What’s more, Q will remember previous questions and can be used to refine a previously generated query.

According to an AWS blog: “The SQL generation model uses metadata specific to your data schema to generate relevant queries. For example, it uses the table and column names and the relationship between the tables in your database. In addition, your database administrator can authorize the model to use the query history of all users in your AWS account to generate even more relevant SQL statements.”
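
To make the idea concrete, here is a minimal Python sketch of how schema metadata and an optional query history might be folded into a single prompt for a SQL-generating model. AWS has not published how Amazon Q assembles its prompts, so everything here is an assumption for illustration: the `Table` class, the `build_sql_prompt` function, the prompt wording and the sample tables are all hypothetical. The sketch covers only the prompt-assembly step; it does not call a model and does not implement retrieval.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical container for the schema metadata described above."""
    name: str
    columns: list[str]
    # Relationships as (column, referenced_table, referenced_column).
    foreign_keys: list[tuple[str, str, str]] = field(default_factory=list)


def build_sql_prompt(question, tables, query_history=None, max_history=3):
    """Assemble a text prompt from table/column names, table
    relationships and, optionally, recent queries."""
    lines = ["You write SQL for the schema below.", "", "Schema:"]
    for table in tables:
        lines.append(f"- {table.name}({', '.join(table.columns)})")
        for column, ref_table, ref_column in table.foreign_keys:
            lines.append(
                f"  {table.name}.{column} references {ref_table}.{ref_column}"
            )
    # Query history is optional, mirroring the administrator opt-in.
    if query_history:
        lines += ["", "Recent queries:"]
        lines += [f"- {sql}" for sql in query_history[-max_history:]]
    lines += ["", f"Question: {question}", "SQL:"]
    return "\n".join(lines)


# Illustrative schema, not taken from any real database.
tables = [
    Table("customers", ["customer_id", "region"]),
    Table(
        "orders",
        ["order_id", "customer_id", "total"],
        [("customer_id", "customers", "customer_id")],
    ),
]

prompt = build_sql_prompt(
    "What were total sales by region?",
    tables,
    query_history=["SELECT COUNT(*) FROM orders"],
)
print(prompt)
```

The point of the sketch is simply that the more the prompt knows about tables, columns and relationships, the better the odds that the generated SQL joins the right things, which is why the optional query history matters.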

From a security perspective, Q won’t share query histories with other AWS accounts and it won’t train underlying GenAI models with any data coming from customer AWS accounts. Amazon Q in Redshift is in preview in two U.S. regions (East and West).

Also announced at Re:Invent was Amazon Q in AWS Glue, which is the cloud vendor’s extract, transform, load (ETL) data-integration service. Here, too, the GenAI will generate SQL code, but in this case for ETL jobs and pipelines rather than queries. Q will also assist with troubleshooting and answer help questions. This service is “coming soon” and is not yet available in preview.

MyPOV on Amazon Q in Redshift and Glue

Writing queries and developing SQL ETL jobs and pipelines are tedious, time-consuming tasks. Code generation, whether for SQL, Python or any other language, has already been proven to be a time- and labor-saving use case for GenAI. Competitors are also pursuing this use case. Google Cloud has announced GenAI capabilities for Redshift rival BigQuery, namely Duet AI with BigQuery and Duet AI in BigQuery Studio, both of which are in preview at this writing. And in the integration space, vendors including Boomi, Informatica, SnapLogic and Software AG have already jumped on the GenAI bandwagon.

There are no charges for Amazon Q in Redshift while it’s in preview, but it’s a fair guess that once this feature is generally available, AWS will pass through compute costs, at a minimum, likely through consumption of Redshift Processing Units (RPUs). To my mind the costs of natural language code generation, testing and troubleshooting, and querying and explanation will be well worth it, but it will be up to organizations to understand the value of the time saved and of making people that much more productive. The danger is that the bean counters and budget holders will have a knee-jerk reaction when the costs of GenAI start to emerge.

While 2023 will go down as the year of GenAI previews, 2024 promises to be the year that the GenAI bills start to come due. It will take the proverbial business-IT collaboration to develop a clear-eyed understanding of what’s really delivering value.

Related reading:
AWS Expands Zero-ETL Options, Adds AI Recommendations for DataZone
AWS Introduces Two Important Database Upgrades at Re:Invent 2023
Google Sets BigQuery Apart With GenAI, Open Choices, and Cross-Cloud Querying