AWS re:Invent 2024 - Cloud Operations

Table of Contents

Analytics

AWS re:Invent 2024 - Building for the future: Enterprise-scale AI and analytics (AIM125)

Lennar, a leading home builder, partnered with PwC and AWS to build a modern data and AI platform that enabled enterprise-scale analytics and AI. The platform's key features include a scalable data architecture, robust data quality and governance, and an MLOps platform for industrializing machine learning, resulting in significant cost savings, increased productivity, and a foundation for future generative AI use cases.

Backup and Disaster Recovery

AWS re:Invent 2024 - Backup and disaster recovery strategies for increased resilience (COP319)

This session discusses backup and disaster recovery strategies for increased resilience in the AWS cloud. The speakers cover key concepts such as recovery point objective (RPO) and recovery time objective (RTO), tiering applications, using AWS Elastic Disaster Recovery and AWS Backup, and implementing best practices for cyber resilience and testing recovery capabilities.
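To make the RPO and RTO concepts concrete, here is a minimal sketch of checking an application tier against its objectives. The `Tier` class, the tier names, and every number are hypothetical illustrations and are not taken from the session. The sketch assumes that worst-case data loss is roughly equal to the interval between backups.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    """Hypothetical application tier with its recovery objectives."""
    name: str
    rpo_minutes: int  # maximum tolerable data loss, expressed as time
    rto_minutes: int  # maximum tolerable time to restore service


def meets_objectives(tier, backup_interval_minutes, measured_recovery_minutes):
    """Return True if the backup cadence and a tested recovery time fit the tier.

    Assumption: worst-case data loss is about one backup interval.
    """
    rpo_ok = backup_interval_minutes <= tier.rpo_minutes
    rto_ok = measured_recovery_minutes <= tier.rto_minutes
    return rpo_ok and rto_ok


# Illustrative values only.
critical = Tier(name="tier-1", rpo_minutes=15, rto_minutes=60)
print(meets_objectives(critical, backup_interval_minutes=5, measured_recovery_minutes=45))
print(meets_objectives(critical, backup_interval_minutes=60, measured_recovery_minutes=45))
```

The second call fails the check because an hourly backup could lose more than the 15 minutes of data that the hypothetical tier tolerates.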

Compute

AWS re:Invent 2024 - What's new with AWS cost optimization (COP204)

This session covers the latest AWS cost optimization features, including the new Savings Plans Purchase Analyzer, DynamoDB reserved capacity recommendations, and idle resource recommendations across various AWS services. The presenters provide detailed demonstrations and insights into how these tools can help AWS customers better manage and optimize their cloud costs.

AWS re:Invent 2024 - Achieving governance at scale (COP383)

This session discusses best practices for achieving governance at scale in a multi-account AWS environment. Key topics covered include multi-account structure, security, inventory management, cost optimization, and real-world learnings from Workday's governance journey.

AWS re:Invent 2024 - Control the cost of your generative AI services (COP203)

This presentation provides a comprehensive overview of strategies to control the cost of building and running generative AI services on AWS. The speakers cover a range of approaches, from self-managed infrastructure to fully managed services like SageMaker and Bedrock, highlighting the key considerations and optimization levers at each stage.

AWS re:Invent 2024 - Best practices and new tools for cost reporting and estimation (COP218)

This talk provides a comprehensive overview of best practices and new tools for cost reporting and estimation in the AWS cloud environment. The speakers cover a range of topics, including data normalization, cost visibility, optimization opportunities, and cost planning, to help organizations gain better control and understanding of their cloud investments.

AWS re:Invent 2024 - New governance capabilities for multi-account environments (COP378-NEW)

This presentation highlights new AWS governance capabilities for multi-account environments, including the launch of Declarative Policies and Resource Control Policies, which enable organizations to centrally define and enforce consistent configurations and access controls across their AWS resources. The session also covers the introduction of Network Activity Events in AWS CloudTrail, which provide enhanced visibility into VPC endpoint traffic and can help inform policy decisions.

AWS re:Invent 2024 - Designing generative AI workloads for resilience (COP332)

The session explores best practices for designing resilient generative AI workloads, covering key aspects such as fault isolation, capacity management, timely output, correct output, and redundancy. The presenters provide a comprehensive overview of the challenges and strategies involved in transitioning from proof-of-concept to production-ready generative AI systems.

Observability

AWS re:Invent 2024 - Byte to insight: Maximize value from your logs with Amazon CloudWatch (COP406)

The session covers new capabilities in Amazon CloudWatch, including metric filters, embedded metric format, contributor insights, and enhanced log analytics, to help maximize value from logs by focusing on the 'what' and 'context' of the insights needed. The presenters demonstrate how these features can be leveraged to optimize cost and gain faster, more accurate insights from logs, especially for known-known and known-unknown scenarios.
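As an illustration of the embedded metric format (EMF) mentioned above, the following sketch builds a single structured log line that carries a metric definition alongside its value. The helper name `build_emf_record`, the `DemoApp` namespace, and the sample dimension and metric are assumptions made for this example and do not come from the session.

```python
import json
import time


def build_emf_record(namespace, dimensions, metric_name, value, unit="Milliseconds"):
    """Return one JSON log line shaped like a CloudWatch EMF record."""
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set listing the dimension keys used below.
                    "Dimensions": [list(dimensions.keys())],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # The metric value sits at the top level under the metric's name.
        metric_name: value,
    }
    # Dimension values are also top-level members of the record.
    record.update(dimensions)
    return json.dumps(record)


# Hypothetical sample values.
print(build_emf_record("DemoApp", {"Service": "checkout"}, "Latency", 123))
```

Writing a line like this to a log stream that CloudWatch ingests lets the metric be extracted from the log event itself, while the rest of the record remains available as context for log queries.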

AWS re:Invent 2024 - Observability the open source way (COP324)

The session covers the challenges and solutions for open-source observability on AWS, including the benefits of managed services like Amazon Managed Service for Prometheus, Amazon Managed Grafana, and Amazon OpenSearch Service. The speakers discuss how these services can help address issues around scalability, cost-effectiveness, and portability, as well as new features and capabilities to streamline observability workflows.

AWS re:Invent 2024 - Don't get stuck: How connected telemetry keeps you moving forward (COP322)

The talk discusses strategies and tools for effective troubleshooting, including identifying five key causes of incidents, navigating between infrastructure and application layers, and leveraging observability data to quickly diagnose and mitigate issues. The speaker also demonstrates a new Amazon CloudWatch feature that uses AI to assist with the troubleshooting process.

AWS re:Invent 2024 - Accelerate innovation with AI-powered operations (COP315)

AWS showcased its latest advancements in AI-powered operations, introducing a guided troubleshooting experience in Amazon Q Developer that leverages machine learning to automatically analyze telemetry data, identify root causes, and suggest mitigation actions. The aim is to help customers streamline their cloud operations and reduce incident resolution times.

AWS re:Invent 2024 - Unlocking business insights with AWS Config, featuring Itaú Unibanco (COP326)

This session explores how AWS Config can help organizations unlock business insights by enabling resource configuration management at scale. The presenters showcase how Itaú Unibanco, a leading financial institution, leverages AWS Config to gain visibility, monitoring, and planning capabilities across their decentralized AWS environment, empowering their teams to make informed decisions and optimize their infrastructure.

AWS re:Invent 2024 - [NEW LAUNCH] Investigate operational issues faster with AI (COP379-NEW)

The session discusses how AWS's AIOps capabilities, including CloudWatch Metric Anomaly Detection and CloudWatch Log Pattern Analysis, can help investigate operational issues faster by leveraging AI and machine learning. The presenters also introduce new features like Amazon Q Operational Investigations, which uses generative AI and other ML techniques to automate the investigation process and identify root causes more efficiently.

AWS re:Invent 2024 - [NEW LAUNCH] What’s new with Amazon CloudWatch (COP381-NEW)

The talk introduces new observability features in Amazon CloudWatch, including CloudWatch Database Insights, CloudWatch unified navigation, and Amazon Q Developer Ops Assistant. These features aim to help developers and DevOps teams quickly detect, investigate, and remediate operational issues in their applications.

AWS re:Invent 2024 - Best practices for end-to-end digital experience monitoring (COP320)

The session covers best practices for end-to-end digital experience monitoring, including using AWS services like CloudWatch RUM, Synthetics, Application Signals, and Database Insights to instrument and monitor the various layers of a modern, distributed application. The presenters emphasize the importance of defining SLOs, instrumenting comprehensively, leveraging distributed tracing, and continuously iterating to optimize the observability of the application.
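One common way to work with an SLO once it is defined is to track how much of its error budget remains. The sketch below shows the arithmetic for a request-based availability SLO. The function name, the 99.9% target, and the request counts are hypothetical and are not taken from the session.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Return the fraction of the error budget still unspent (negative if overspent)."""
    allowed_failures = total_requests * (1.0 - slo_target)
    if allowed_failures == 0:
        raise ValueError("no error budget: target is 100% or there were no requests")
    return 1.0 - failed_requests / allowed_failures


# Illustrative numbers: a 99.9% target over one million requests allows
# about 1,000 failures, so 250 failures spend roughly a quarter of the budget.
remaining = error_budget_remaining(0.999, 1_000_000, 250)
print(f"{remaining:.0%} of the error budget remains")
```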

AWS re:Invent 2024 - Unlock the power of application monitoring (COP359)

This talk explores the power of application monitoring using AWS services, focusing on the new Application Signals feature that provides an opinionated approach to Application Performance Monitoring (APM). The speaker highlights how Application Signals, built on open standards like OpenTelemetry, can help organizations improve application health, performance, and customer experience while reducing operational costs.

AWS re:Invent 2024 - Best practices for generative AI observability (COP404)

The presentation covers best practices for observability in generative AI systems, focusing on four key layers: service-level metrics and logs, orchestration-level tracing, advanced metrics and analysis, and end-user feedback. The speakers demonstrate how to leverage Amazon CloudWatch to implement observability across these layers, providing a comprehensive approach to monitoring and understanding the performance and behavior of generative AI applications.

Operations

AWS re:Invent 2024 - Scaling IT with the next generation of AWS Systems Manager (COP380-NEW)

The video presents the evolution of AWS Systems Manager, a service that has grown from internal operational tooling to a comprehensive solution for managing compute nodes at scale across hybrid and multi-cloud environments. The key highlights include the introduction of a new integrated experience that provides centralized visibility and control over managed and unmanaged nodes, as well as the emphasis on automation and compliance through pre-built runbooks and diagnostics.

AWS re:Invent 2024 - Implementing application performance monitoring (COP409)

This video discusses how Amazon CloudWatch Application Signals can provide a comprehensive view of application performance and help engineers make informed decisions. The speakers, including a representative from PBS, share how Application Signals has improved their operational efficiency and enabled them to proactively address issues before they impact customers.

AWS re:Invent 2024 - Centralize multicloud management using AWS (COP321)

The session focuses on centralizing operations in a multi-cloud environment using AWS services. It demonstrates how AWS can help customers navigate the complexity and cost of operating in a multi-cloud landscape, showcasing solutions for inventory management, patch management, remote access, and observability across different cloud platforms.

AWS re:Invent 2024 - Operating your fleet of resources at scale is easier than you think! (COP325)

This talk discusses how AWS services like CloudWatch, Systems Manager, Config, and CloudTrail can be used to simplify operations and automate at scale. The key takeaways are to simplify scale, bring in intelligence, and automate as much as possible to improve operational efficiency.

AWS re:Invent 2024 - Streamlining application management on AWS (COP328)

This session covers AWS's application operations capabilities, which provide a centralized way to monitor and manage applications on AWS. It also demonstrates how AWS Systems Manager can help streamline application management and operations through features like automation, patching, and remote access.

Security

AWS re:Invent 2024 - Navigating the AWS security controls toolbox (COP361)

This session provides a comprehensive overview of AWS security controls and governance services, including AWS Config, AWS Organizations Policies, and CloudFormation Guard. The presentation demonstrates how these services can be integrated to establish a well-governed and secure cloud environment, enabling organizations to focus on innovation while maintaining the necessary controls and compliance.

AWS re:Invent 2024 - Dive deep on AWS cloud governance (COP402)

This video provides a comprehensive overview of AWS cloud governance, covering account strategy, preventive and proactive controls, and auditing capabilities. It highlights the importance of defining a multi-account strategy, leveraging AWS Control Tower, automating governance through infrastructure as code, and utilizing AWS services like CloudTrail, Config, and Audit Manager to ensure compliance and security.

AWS re:Invent 2024 - Accelerating auditing and compliance for generative AI on AWS (COP327)

This talk discusses the key considerations for accelerating auditing and compliance for generative AI on AWS, including the differences between traditional AI and generative AI, and a framework for deploying and building generative AI applications responsibly. It covers important domains such as accuracy, fairness, privacy, resilience, responsible use, safety, security, and sustainability, and provides practical guidance on leveraging AWS tools and services to simplify the compliance journey.