Verity Interview: DevOps / MLOps

Name: Entrevista Verity Devops MLOPS
Uploaded: 2026-02-04T14:54:45.000Z
Duration: 1 h 24 min 39 s

Interview with Guilherme: Cloud and DevOps Experience

Introduction and Setup

The conversation begins with greetings between participants, indicating a friendly atmosphere.

Guilherme's resume is mentioned, highlighting the importance of reviewing his qualifications for the interview.

Cáio introduces himself as an SRE and DevOps professional at Vert, emphasizing his expertise in cloud technologies.

Professional Background

Cáio expresses the goal of the interview: to assess Guilherme's knowledge in cloud and DevOps areas.

Guilherme is asked to summarize his experiences over the past five years in cloud environments, noting a background in infrastructure.

Career Progression

Guilherme describes his evolution from infrastructure roles to positions involving DevOps, SRE, and cloud engineering.

He mentions working within cross-functional teams across various banks, which provided him with diverse experiences.

Key Responsibilities

Discussion on working with multiple squads (teams), including managing up to 30 squads simultaneously.

Emphasis on SRE pillars such as observability, security, scalability, and process documentation.

Technical Skills and Tools

Guilherme details his experience with Kubernetes in both on-premises (OpenShift) and cloud environments (AKS/EKS).

He outlines responsibilities related to transitioning systems from on-premises setups to cloud infrastructures.

Project Challenges

Discussion about specific projects involving legacy systems like mainframes; highlights challenges faced during these transitions.

Mention of different strategies used for migration projects—refactor vs. rehost—and their implications for efficiency gains.

Architecture and Technology Stack Discussion

Overview of Project Architecture

The speaker discusses a project that used a serverless architecture, mentioning AWS services such as Lambda, MSK (Managed Streaming for Apache Kafka), and S3.

Emphasizes the challenges faced during the project, particularly in transitioning from a traditional framework to a more modern stack.

Mentions Java and Spring Boot as part of the technology stack but notes that Spring Boot is not necessarily the best option for microservices in Java.

Experience with Cloud Services

The speaker highlights their experience with multiple cloud platforms, including AWS and Google Cloud (GCP), while working at C6 Bank.

Discusses multi-cloud strategies implemented at Santander, where 80% of operations were on Azure and 20% on AWS.

Describes how disaster recovery was managed between GCP and AWS during outages.

DevOps Practices and Tools

Pipeline Management

The interviewer asks about the candidate's experience over the last five years with pipeline tools such as Azure DevOps and GitHub Actions.

The speaker mentions using Azure DevOps for integration pipelines but clarifies that they did not implement it from scratch; rather, they managed existing pipelines.

Testing Frameworks

Discusses key pillars of DevOps, emphasizing agile delivery and prioritizing people over technology.

Asks about automation tools used within DevOps practices, specifically focusing on Continuous Integration (CI).

Continuous Integration (CI)

CI Process Steps

Outlines important steps in CI: build process followed by testing phases.

Highlights experiences with unit testing frameworks like JUnit for Java applications and performance testing methodologies.
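
The build-then-test ordering described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the stage names and the `run_pipeline` helper are hypothetical and do not come from the interview, and each stage is a plain callable standing in for a real build or test command.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[Tuple[str, bool]]:
    """Run stages in order and stop after the first failure."""
    results: List[Tuple[str, bool]] = []
    for name, step in stages:
        ok = step()
        results.append((name, ok))
        if not ok:
            break  # later stages never run after a failed build or test
    return results


# Hypothetical stand-ins for real build and test commands.
example_stages: List[Stage] = [
    ("build", lambda: True),
    ("unit-tests", lambda: False),
    ("performance-tests", lambda: True),
]
```

With these example stages, the performance tests are never reached because the unit tests fail first, which is the fail-fast behaviour most CI systems apply by default.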

Framework Utilization

Queries about frameworks used for executing tests across different programming languages such as Java or .NET.

Notes experience with various technologies including Java, Python, and specific testing tools like JMeter.

DevOps Practices and Cultural Challenges

Testing Integration with Cypress

The speaker discusses the use of Cypress for local testing, emphasizing that tests were run on personal machines before being pushed to the pipeline, where graphical interfaces were not available.

Cultural Implications of DevOps

The conversation shifts to the cultural aspects of implementing DevOps, highlighting the need for educating team members about new practices. The speaker reflects on their experiences in environments lacking prior knowledge transfer.

Resistance to Change in Traditional Environments

There is significant resistance when introducing new technologies in long-established companies. The speaker notes that many organizations are conservative and hesitant to adopt changes like CI/CD pipelines.

Approaches to Overcoming Resistance

The speaker prefers a collaborative approach over top-down directives. They advocate for demonstrating practical examples (like using Docker) to ease concerns and encourage hands-on learning among team members.

Communication and Engagement Strategies

By bringing ready-to-use components into discussions, the speaker effectively engages stakeholders. This method helps break down barriers and fosters a more interactive environment during training sessions.

Monitoring and Traceability in Deployments

Implementing Traceability Measures

After implementing CI/CD pipelines, traceability was established through automated email notifications for each deployment action taken by users, enhancing accountability within the team.
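
A minimal sketch of such a per-deployment notification is shown below, using only the Python standard library. The field names, the sender address, and the `build_deploy_notification` helper are assumptions made for illustration; the interview does not say how the emails were actually generated.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from typing import List, Optional


def build_deploy_notification(
    user: str,
    service: str,
    version: str,
    environment: str,
    recipients: List[str],
    when: Optional[datetime] = None,
) -> EmailMessage:
    """Build (but do not send) an email recording who deployed what, where, and when."""
    when = when or datetime.now(timezone.utc)
    msg = EmailMessage()
    msg["Subject"] = f"[deploy] {service} {version} to {environment}"
    msg["From"] = "ci-bot@example.com"  # hypothetical sender address
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        f"User: {user}\n"
        f"Service: {service}\n"
        f"Version: {version}\n"
        f"Environment: {environment}\n"
        f"Timestamp (UTC): {when.isoformat()}\n"
    )
    return msg
```

A pipeline step could pass the returned message to `smtplib.SMTP(...).send_message(msg)`; the sending step is left out here so the sketch runs without a mail server.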

Monitoring Application Health Post-Deployment

The discussion includes how monitoring is integrated into DevOps practices. Although traditionally seen as an SRE task, it’s crucial for DevOps professionals to ensure application health post-deployment.

Updating Monitoring Tools

New pipelines incorporated libraries like Prometheus directly into codebases for better monitoring capabilities. Older pipelines required manual updates or adjustments to integrate these tools effectively.
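
To give a sense of what instrumenting a codebase for Prometheus involves, the sketch below keeps a labelled counter and renders it in the Prometheus text exposition format. It is a dependency-free illustration only: the `SimpleCounter` class and the metric name are hypothetical, and a real service would more likely use an official Prometheus client library rather than hand-rolled code like this.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


class SimpleCounter:
    """A tiny labelled counter that renders Prometheus text exposition lines."""

    def __init__(self, name: str, help_text: str, label_names: Sequence[str]):
        self.name = name
        self.help_text = help_text
        self.label_names = tuple(label_names)
        self._values: Dict[Tuple[str, ...], float] = defaultdict(float)

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        """Increase the counter for one combination of label values."""
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(labels[name] for name in self.label_names)
        self._values[key] += amount

    def render(self) -> str:
        """Return the metric as HELP, TYPE, and one sample line per label set."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key in sorted(self._values):
            label_str = ",".join(
                f'{n}="{v}"' for n, v in zip(self.label_names, key)
            )
            lines.append(f"{self.name}{{{label_str}}} {self._values[key]}")
        return "\n".join(lines) + "\n"
```

An application would call `inc` on each handled request and serve the output of `render` on a metrics endpoint that Prometheus scrapes.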

Infrastructure as Code (IaC)

Utilizing Terraform and Other Tools

The speaker mentions experience with Terraform and other IaC tools, discussing their integration within cloud environments. They highlight adaptability during incidents requiring migration between cloud services.

Portability of Projects Across Clouds

A specific incident is referenced where a project was successfully migrated from one cloud provider to another using Kubernetes deployments alongside Terraform configurations, showcasing flexibility in infrastructure management.

Infrastructure Deployment and Management in Multi-Cloud Environments

Overview of Infrastructure Flexibility

The discussion begins with the importance of cross-cloud flexibility, highlighting how infrastructure pipelines were integrated with tools like Terraform to manage deployments effectively.

Emphasis is placed on testing disaster recovery (DR) processes, with tests run on both weekly and monthly schedules to ensure readiness for potential failures.

Incident Response and Recovery

When an incident occurred in Virginia, the team successfully transitioned their infrastructure to Google Cloud Platform (GCP), utilizing a tool referred to as "B" for deployment.

The infrastructure was already validated; thus, the transition involved deploying various components such as GKE (Google Kubernetes Engine), load balancers, and CDNs swiftly.

Azure Environment Considerations

The conversation shifts focus towards Azure cloud environments, prompting questions about managing Infrastructure as Code (IaC) within corporate settings.

A scenario is presented regarding organizing Terraform modules to allow multiple teams to provision resources while adhering to security policies.

Security Strategies in Terraform Projects

The speaker discusses modularization within Terraform projects, emphasizing its necessity for multi-cloud operations and resource management without compromising security.

An example is provided where specific tiers are validated for deployment, preventing unauthorized resource provisioning outside established parameters.
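
The tier check described above could look roughly like the sketch below. The environment names, tier names, and the `validate_request` helper are hypothetical; in Terraform itself this kind of rule is often expressed as a variable validation inside a shared module, and the Python here only illustrates the logic.

```python
from typing import Dict, List, Set

# Hypothetical allowlist: which tiers each environment may provision.
ALLOWED_TIERS: Dict[str, Set[str]] = {
    "dev": {"Basic", "Standard"},
    "prod": {"Standard", "Premium"},
}


def validate_request(environment: str, tier: str) -> List[str]:
    """Return a list of policy violations; an empty list means the request is allowed."""
    allowed = ALLOWED_TIERS.get(environment)
    if allowed is None:
        return [f"unknown environment: {environment}"]
    if tier not in allowed:
        return [
            f"tier {tier!r} is not approved for {environment}; "
            f"approved tiers: {sorted(allowed)}"
        ]
    return []
```

A pipeline gate would reject the plan whenever the returned list is non-empty, so teams can self-serve only within the approved parameters.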

Access Control Mechanisms

To maintain security, it’s crucial that teams cannot deploy resources indiscriminately. This involves implementing strict access controls based on predefined roles.

AWS IAM (Identity and Access Management) is mentioned as a tool used for enforcing minimum access permissions by creating granular policies tied to specific resources.
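
As an illustration of a granular, resource-scoped policy, the sketch below builds an IAM policy document that allows object reads and writes only under one prefix of one S3 bucket. The bucket name, prefix, and helper function are hypothetical examples; the interview does not describe the actual policies used.

```python
import json
from typing import Any, Dict


def least_privilege_s3_policy(bucket: str, prefix: str) -> Dict[str, Any]:
    """Build an IAM policy limited to objects under a single bucket prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadWriteObjectsUnderPrefix",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                "Sid": "ListOnlyThatPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix, printed as the JSON that IAM expects.
    print(json.dumps(least_privilege_s3_policy("team-a-artifacts", "reports"), indent=2))
```

Because nothing outside the listed actions and ARNs is allowed, any attempt to touch another bucket or service is denied implicitly.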

Evaluating Security Risks

A hypothetical situation illustrates how unauthorized attempts to deploy services could lead to data leaks; hence strict monitoring and permissioning are essential.

The concept of minimal access is reiterated—ensuring users can only deploy within defined scopes prevents potential security breaches during cloud operations.

Terraform Project Management and Access Control

Managing Multiple Terraform Projects

Discussion on managing multiple Terraform projects within a single cloud environment, each representing different business units.

A system administrator (sysadmin) or Site Reliability Engineer (SRE) needs to be able to restrict access to resources across all accounts in the organization.

Strategies for Global Access Control

Inquiry into strategies for applying global access rules without configuring permissions individually for each user or profile.

Mention of AWS Organizations as a method to configure global permissions from the root account, which can replicate settings across other accounts.
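
One common way to apply such organization-wide rules is a service control policy (SCP) attached through AWS Organizations. The interview summary does not say which rules were used, so the sketch below is an assumption: it builds an SCP that denies requests outside a list of approved regions, with an illustrative (not exhaustive) exemption list for global services.

```python
import json
from typing import Any, Dict, List

# Illustrative subset of global services that are usually exempted from region rules.
GLOBAL_SERVICE_ACTIONS: List[str] = [
    "iam:*",
    "organizations:*",
    "route53:*",
    "cloudfront:*",
    "support:*",
]


def region_restriction_scp(approved_regions: List[str]) -> Dict[str, Any]:
    """Build an SCP that denies regional API calls outside the approved regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                "NotAction": GLOBAL_SERVICE_ACTIONS,
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": approved_regions}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical approved regions.
    print(json.dumps(region_restriction_scp(["sa-east-1", "us-east-1"]), indent=2))
```

Attached at the organization root or to an organizational unit, a policy like this limits every member account at once, without editing permissions per user or per profile.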

Governance and Security Integration

Reference to governance teams working alongside security teams to implement organizational-wide rules through CI/CD pipelines.

CI/CD Pipeline Structure and Image Immutability

Structuring CI/CD Pipelines

Question posed about structuring a pipeline that consistently uses the same validated container image for production deployment.

Ensuring Immutability and Traceability

Exploration of how to ensure immutability and traceability of container images within pipelines, emphasizing the importance of using approved images only.

Discussion on utilizing AWS services like ECR (Elastic Container Registry) for storing validated images, ensuring that only pre-approved versions are used in deployments.
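
One way to guarantee that production receives exactly the image that was validated is to deploy by content digest instead of by tag. The sketch below assumes a hypothetical set of approved digests maintained by the pipeline; the repository name and helper functions are illustrative and are not taken from the interview.

```python
import hashlib
from typing import Set


def digest_of(manifest_bytes: bytes) -> str:
    """Return a content digest in the 'sha256:<hex>' form used by container registries."""
    return "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()


def pinned_reference(repository: str, digest: str, approved: Set[str]) -> str:
    """Return 'repository@digest' only if the digest was approved earlier in the pipeline."""
    if digest not in approved:
        raise PermissionError(f"image {digest} has not been approved for deployment")
    return f"{repository}@{digest}"


# Hypothetical flow: the validation stage records the digest, the deploy stage reuses it.
validated_manifest = b'{"example": "manifest validated in staging"}'
approved_digests = {digest_of(validated_manifest)}
production_image = pinned_reference(
    "example-registry/payments-api", digest_of(validated_manifest), approved_digests
)
```

Because a digest identifies the image content itself, a tag that is later moved or overwritten cannot change what gets deployed, and the recorded digest doubles as an audit trail.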

Updating Images in Pipelines

Explanation of how new versions of images can be built and pushed directly to ECR while maintaining control over which versions are available for use.

Immutability Techniques in Cloud Environments

Configuring Image Immutability

Inquiry into methods for making container images immutable within AWS environments, seeking examples of services or techniques applicable.
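
The summary does not record the candidate's answer to this question. One option that exists in AWS is the ECR tag immutability setting, which stops an existing tag from being overwritten. The sketch below wraps the corresponding ECR API call; the helper name and repository name are hypothetical, and the client is passed in so the function can be exercised without AWS credentials.

```python
from typing import Any, Dict


def enforce_tag_immutability(ecr_client: Any, repository_name: str) -> Dict[str, Any]:
    """Set an ECR repository's tags to IMMUTABLE so pushed tags cannot be overwritten.

    `ecr_client` is expected to behave like the client returned by
    boto3.client("ecr"); any object with the same method works.
    """
    return ecr_client.put_image_tag_mutability(
        repositoryName=repository_name,
        imageTagMutability="IMMUTABLE",
    )


# Example usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   enforce_tag_immutability(boto3.client("ecr"), "payments-api")
```

Tag immutability complements digest pinning: the first prevents a tag from silently changing, and the second removes the dependence on tags at deploy time.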

Observability in SRE Practices

Defining SLI Implementation

Discussion on observability from an SRE perspective, focusing on defining Service Level Indicators (SLIs).
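
As a small worked example of what defining an SLI can mean in practice, the sketch below computes two common indicators: availability as the fraction of successful requests, and latency as the fraction of requests served within a threshold. The function names, sample numbers, and threshold are illustrative assumptions, not values from the interview.

```python
from typing import Sequence


def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests that succeeded."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests


def latency_sli(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """Fraction of requests served within the latency threshold."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    fast = sum(1 for value in latencies_ms if value <= threshold_ms)
    return fast / len(latencies_ms)


def meets_slo(sli: float, slo_target: float) -> bool:
    """Compare a measured SLI against a target such as 0.99."""
    return sli >= slo_target
```

For instance, 9,990 successful requests out of 10,000 give an availability SLI of 0.999, which meets a 0.99 target but not a 0.9995 one.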