Lecture 5: Cyclomatic Complexity and Virtualization
Understanding Cyclomatic Complexity
Introduction to Cyclomatic Complexity
- The speaker introduces the topic of cyclomatic complexity, indicating a need for clarity and depth in understanding it.
- A reference is made to a previous discussion about a graph that illustrates the concept, suggesting a visual approach to understanding complex systems.
Key Concepts of Cyclomatic Complexity
- The formula for calculating cyclomatic complexity is presented: V(G) = E − N + 2P, where E is the number of edges, N the number of nodes, and P the number of connected components of the control-flow graph. This formula serves as the foundation for further analysis.
- The speaker counts nodes and edges in the graph, noting discrepancies in initial calculations which leads to an adjustment in understanding the complexity.
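The formula can be checked on a small sketch of a control-flow graph. The graph below (an `if/else` followed by a loop) is an illustrative example, not the lecture's actual graph:

```python
# Cyclomatic complexity: V(G) = E - N + 2*P
# Toy control-flow graph: an if/else branch followed by a loop.
edges = [
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "loop"),
    ("else", "loop"),
    ("loop", "loop_body"),
    ("loop_body", "loop"),   # back edge of the loop
    ("loop", "exit"),
]
nodes = {n for edge in edges for n in edge}

E = len(edges)   # 8
N = len(nodes)   # 7
P = 1            # one connected component

complexity = E - N + 2 * P
print(complexity)  # -> 3 (one if, one loop, plus one)
```

The result 3 matches the intuition that each decision point (the `if` and the loop condition) adds one to a base complexity of one.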
Analyzing Paths Through Code
- The discussion shifts to identifying different paths through code, emphasizing how various routes can lead to the end point within a program's logic.
- Examples are provided on how loops and cycles affect path counting, illustrating practical implications on cyclomatic complexity.
Clarifying Misunderstandings
- The speaker acknowledges potential confusion regarding path counting and emphasizes systematic approaches to ensure accurate calculations.
- A critical point is made about distinguishing between linear independence in control paths versus simple path counting.
Defining Linear Independence
- Linear independence is defined with respect to control paths; a new path is independent of an existing set if it covers at least one edge not already covered by that set.
- A detailed explanation follows on how one determines if a control path adds new information or coverage within the graph structure.
Conclusion on Path Coverage
- The importance of covering all control transitions without redundancy is highlighted as essential for determining when all necessary paths have been accounted for.
- A formal definition emerges regarding what constitutes a linearly independent control path based on edge coverage from existing sets.
Control Paths and Linear Independence in Graph Theory
Exploring Control Paths
- The speaker discusses the inclusion of a new edge as a linearly independent control path, indicating excitement about this addition.
- A proposed control path is described, which involves descending and looping back, highlighting its uniqueness compared to previous paths but noting it lacks linear independence because it introduces no new edges.
Path Enumeration
- The first control path is enumerated as 1, 5, 7, 11; the second includes an additional edge: 1, 5, 7, 6, 5, 7, 11.
- A third path is introduced as: 1, 2, 4, 10, and then adding edges leads to further complexity in enumeration.
Matrix Representation
- The speaker requests assistance in constructing a matrix that represents these control paths and their corresponding edges.
- The first control path's edges are noted for inclusion in the matrix while emphasizing the need for refinement later.
Analyzing Edge Sets
- Subsequent paths are analyzed for their edge sets; discrepancies are noted regarding which edges belong to each path.
- Adjustments are made to ensure proper indexing within the matrix representation of these paths.
Understanding Linear Independence
- Discussion shifts towards how ranks change within the constructed matrix without manual calculations being necessary.
- The concept of linear independence is tied back to how these sets can be viewed as rows representing different control paths.
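The matrix view can be made concrete with a small sketch: each row is a control path, each column an edge (1 if the path uses that edge), and a candidate path is linearly independent of the set exactly when adding its row increases the rank. The paths and edges below are invented for illustration, not the lecture's exact graph:

```python
from fractions import Fraction

def rank(matrix):
    """Matrix rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        # Find a pivot row for this column below the current rank.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Rows: control paths; columns: edges e1..e5 (1 = path traverses the edge).
p1 = [1, 1, 0, 0, 1]
p2 = [1, 1, 1, 0, 1]   # adds edge e3 -> independent of p1
p3 = [0, 0, 1, 0, 0]   # equals p2 - p1, so it adds no new information

print(rank([p1, p2]))      # rank grows to 2: p2 is independent
print(rank([p1, p2, p3]))  # still 2: p3 is NOT linearly independent
```

Note that p3 does cover edge e3, yet as a row it is a linear combination of p1 and p2, which is exactly the distinction between simple edge coverage and linear independence discussed above.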
Challenges with Demonstrating Concepts
Limitations of Current Examples
- The speaker reflects on difficulties encountered when trying to demonstrate concepts clearly through examples involving matrices and pathways.
Importance of Coverage Sets
- Emphasis is placed on coverage sets; even if a new pathway isn't listed among previously considered ones (like P1 or P4), it may still not contribute new information if it's not linearly independent.
Introduction to Virtualization Concepts
Overview of Virtualization History
- Transitioning topics introduces virtualization; a brief historical context suggests its origins trace back to the 1960s rather than being a recent development.
The Origins of Virtualization
Early Developments in Computing
- The discussion begins with the origins of virtualization, tracing back to IBM in the 1960s when expensive mainframe computers were primarily used by government ministries.
- In the 60s, it was crucial to manage multiple tasks on a single mainframe without interfering with each other's work, leading to early considerations for virtual machines.
- The concept emerged where small virtual machines could run on a large computer, allowing users to operate as if they had their own dedicated systems.
Shift in Technology Landscape
- By the 1980s, personal computers became more affordable and prevalent, causing a decline in interest for virtualization as people opted for multiple smaller physical machines instead of costly mainframes.
- As software evolved, users began running various applications on single machines but soon realized this led to inefficient resource utilization and potential conflicts between applications.
Challenges with Resource Management
- Users faced issues when different applications shared resources; for example, an application might disrupt another's functionality due to overlapping resource requests.
- This chaotic mixing of services highlighted the need for solutions that allowed separate operations while sharing underlying hardware effectively.
Emergence of Hypervisor Solutions
- Hypervisor-based solutions were proposed as a way to create isolated environments within a single machine while managing resources efficiently.
- A hypervisor allows multiple operating systems or applications to run concurrently on one physical machine by abstracting hardware resources.
Functionality and Benefits of Hypervisors
- Hypervisors can provide services that allow guest operating systems to function independently from one another while still utilizing shared hardware resources effectively.
- Modern hypervisors can abstract physical infrastructure so that applications believe they are running on dedicated hardware despite being virtualized.
Memory Management Techniques
- When an application accesses memory addresses through a hypervisor, it may not be aware it's operating within a virtual environment; memory translation ensures proper data retrieval regardless of actual physical locations.
- Depending on implementation and hardware support, hypervisors can optimize memory access through fast address translations or rely on traditional storage methods if necessary.
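The two-step translation can be sketched as a toy lookup. The page size, page numbers, and table contents below are invented for illustration; in real systems this walk is performed by hardware page tables, not Python dictionaries:

```python
PAGE_SIZE = 4096

# Hypothetical page tables:
#   guest virtual page  -> guest physical page  (maintained by the guest OS)
#   guest physical page -> host physical page   (maintained by the hypervisor)
guest_page_table = {0: 5, 1: 7}
host_page_table = {5: 42, 7: 13}

def translate(guest_virtual_addr):
    """Guest virtual -> guest physical -> host physical address."""
    page, offset = divmod(guest_virtual_addr, PAGE_SIZE)
    guest_phys_page = guest_page_table[page]            # first level
    host_phys_page = host_page_table[guest_phys_page]   # second level
    return host_phys_page * PAGE_SIZE + offset

# The guest asks for address 100 on its page 0 and is unaware that the
# data physically lives on host page 42.
print(translate(100))
```

The guest only ever sees the left-hand side of the first table, which is why it can remain unaware that it is running virtualized.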
Overview of Operating Systems and Hypervisors
Services Provided by Operating Systems
- The operating system (OS) offers numerous services that applications can utilize, contrasting with the hypervisor which appears to have limited functionality.
- The interface provided by the OS is significantly larger than that of the hypervisor, leading to a need for abstraction layers over hardware.
Functionality of Hypervisors
- Hypervisors enable the installation of multiple operating systems by configuring virtual machines with specified resources like memory and disk space.
- When an OS runs through a hypervisor, it remains unaware of its virtualized environment, allowing seamless operation as if it were on physical hardware.
Evolution and Adoption
- Virtualization technology was already established in the 1990s, with tools like VMware becoming commonplace in educational settings around 2000.
- Users quickly recognized that operating systems are resource-intensive; running multiple instances can lead to inefficient resource allocation.
Resource Management Challenges
- There are concerns about resource efficiency when deploying large virtual machines for minimal workloads, highlighting inefficiencies in memory usage.
- As virtualization became widespread, it became clear that fine-tuning infrastructure was necessary to avoid excessive resource consumption.
Containerization Solutions
- In response to inefficiencies, container solutions emerged that abstracted away traditional operating systems while still utilizing underlying infrastructure.
- Containers operate without full access to all OS features; certain functionalities may not be available within containers compared to traditional setups.
Limitations and Current Trends
- While containers excel at running backend applications without UI requirements, they struggle with graphical interfaces and native hardware access.
- Docker provides guidelines on what can be configured within containers but has limitations regarding UI-based applications.
Conclusion on Virtualization Approaches
- Both hypervisor-based virtualization and containerization have their places; data centers often prefer containers due to their lightweight nature for non-GUI tasks.
Hypervisor Technology and Its Evolution
Understanding Hypervisors
- The hypervisor layer in operating systems was initially software-based, creating challenges as it emulated hardware functionalities. This required complex software solutions for memory address translation.
- As the importance of hypervisors became evident, support transitioned to processor-level implementations, enhancing performance and efficiency in virtualization.
Hardware vs. Software Hypervisors
- Modern processors now include both hardware and software support for hypervisors, which is crucial for their effective operation.
- A key component for efficient hypervisor functionality is second-level address translation (SLAT), which lets guest memory accesses be translated to host physical addresses in hardware, without trapping into the hypervisor on each access.
Memory Management Challenges
- The physical memory is mapped onto a virtual address space that can be duplicated multiple times, complicating how processes access memory.
- When using containerization over traditional operating systems, there are lower overhead costs but increased complexity in managing file access requests.
Resource Contention Issues
- Virtual machines running on the same physical hardware can lead to resource contention; if multiple applications require more resources than available, they may interfere with each other’s performance.
- Applications perceive their allocated resources (e.g., 4GB of RAM), but actual availability may be limited due to shared usage among multiple virtual instances.
Implications for Application Performance
- If an application requests resources beyond what is physically available on the host machine, it may experience degraded performance due to paging or disk access delays.
- The hypervisor's management of virtualized resources means that applications cannot always trust the reported availability of their requested resources.
Automation in Development Environments
- The discussion shifts towards automating quality assurance processes within development environments through containerization strategies.
Development Environments and Containerization
The Importance of Containerization in Development
- During development for different companies, specific environments are required (e.g., particular runtime or database versions), which can conflict with one another. Containerization allows developers to create a standardized environment where all necessary components run together.
- Modern development environments support the option to run applications directly on local machines or within containers, providing flexibility based on project needs.
Benefits of Using Containers
- Running applications in containers enhances speed, stability, and security. For instance, if a database server runs in a container, it minimizes risks from external factors like accidental deletions by other users on the same machine.
- Automation is crucial for testing processes such as unit tests. Developers can enforce that no code commits occur until all tests pass, ensuring quality control before integration.
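One common way to give each developer the same isolated database described above is a Compose file. This is a minimal sketch; the image, port, and credentials are placeholders, not values from the lecture:

```yaml
services:
  db:
    image: postgres:16            # database engine and tag are examples
    environment:
      POSTGRES_PASSWORD: devonly  # throwaway credentials for local development
    ports:
      - "5432:5432"
    volumes:
      - db-data:/var/lib/postgresql/data  # data survives container restarts
volumes:
  db-data:
```

Because the database runs in its own container, other users on the same machine cannot accidentally interfere with its files or configuration.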
Challenges with Testing and Integration
- Complex systems often require integration with external platforms that may not be accessible to every developer. This limitation can hinder thorough testing on individual machines.
- To address these challenges, infrastructure must be built to automate testing processes effectively. Ideally, every change should trigger comprehensive tests without excessive delays.
Prioritizing Tests Based on Relevance
- Given resource constraints (time and budget), it's impractical to test everything after each minor change. Instead, prioritize tests based on their relevance to current developments.
- A structured approach is needed where critical requirements are identified and tested first while less relevant ones follow later.
Automating Testing Processes
- Automated testing frameworks should start from scratch each time changes are made to ensure consistency across different developer environments.
- When multiple developers work simultaneously, they need isolated environments tailored to their specific changes for accurate feedback without interference from others' work.
Virtualization Techniques for Development
- Effective automation requires either container-based or OS-level virtualization solutions that allow the creation of relevant virtual environments quickly.
- Each developer's environment must reflect their last working state accurately so that any changes can be validated against the correct dependencies and configurations.
Virtualization and Containerization in Development
Understanding Virtual Environments
- The speaker discusses the necessity of virtualization for creating and discarding environments, emphasizing that this process cannot occur without it.
- Various platforms like GitLab, GitHub, and Bitbucket offer automated workflows to create virtual environments based on specified events.
- Users can build applications and run unit tests within these virtual environments, which can also support more sophisticated tasks beyond basic operations.
Containerization with Docker
- Docker is highlighted as the most popular containerization platform, often mistaken for general containerization due to its widespread use.
- The speaker notes that while Docker simplifies packaging applications, "dockerizing" is really just containerization: Docker is one implementation of the general concept.
- Docker is now available on Windows through the Windows Subsystem for Linux (WSL), allowing users to leverage its capabilities across different operating systems.
Utilizing Docker Hub
- Docker Hub serves as a public repository for images that can be run in containers; users can upload their own images or access pre-built ones.
- The flexibility of Docker allows multiple applications to be packaged together within a single image, enhancing resource management and deployment efficiency.
Layered Architecture in Containers
- One significant advantage of Docker is its ability to build containers in layers, enabling efficient organization and reuse of components across different projects.
- This layered approach facilitates complex workflows by allowing users to stack services (like databases and application servers), making it easier to manage dependencies.
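The layering described above is visible directly in a Dockerfile, where each instruction produces a cacheable layer. This is a minimal sketch; the base image name and jar path are assumptions for illustration:

```dockerfile
# Base layer: a JRE on Alpine Linux (image name is an example).
FROM eclipse-temurin:21-jre-alpine

WORKDIR /app

# The application jar gets its own layer; if only the jar changes,
# the base layers above are reused from cache instead of being rebuilt.
COPY build/libs/app.jar app.jar

EXPOSE 8080
CMD ["java", "-jar", "app.jar"]
```

Images built from the same base share those base layers on disk, which is one reason containers are so much lighter than full virtual machines.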
Automation with Workflows
- Virtualized environments enable automated QA processes; similar functionalities are found across platforms like Microsoft Azure's DevOps and GitLab/GitHub workflows.
Understanding Docker and Image Management
Key Concepts of Docker Images
- The discussion begins with the explanation of key-value pairs in configuration formats (as in YAML), comparing them to XML or JSON; indentation signifies nesting levels within the structure.
- Emphasis is placed on using advanced text editors (e.g., Notepad++, VS Code) for coding, as they help visualize hierarchy levels effectively compared to basic editors.
- The speaker introduces the concept of running applications within a Docker container, highlighting that descriptions provided will translate into commands executed on either Windows or Linux systems.
- It is noted that instead of manually entering commands in a command prompt, developers can write steps in a more understandable language that reflects their intentions.
- An example is given where an image named "gradle alpine" is specified, indicating it runs on Alpine Linux with Gradle installed. This image can be found on Docker Hub.
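A workflow job using such an image might look like the sketch below. The exact image tag and step names are assumptions; on Docker Hub the "gradle alpine" image corresponds to tags such as `gradle:<version>-jdk21-alpine`:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    # All steps of this job run inside a container created from this image.
    container:
      image: gradle:jdk21-alpine   # Alpine Linux with Gradle and a JDK preinstalled
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4
      - name: Build and run unit tests
        run: gradle build --no-daemon
```

The `run` line is exactly the command a developer would type manually; the surrounding YAML describes the intent in a more readable form, as noted above.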
Navigating Docker Hub
- The speaker navigates through Docker Hub to find specific images and mentions the importance of understanding available options and configurations for various images.
- A specific version of Java (21 JDK) is mentioned as part of an image tag, illustrating how tags provide clarity about what software versions are included in an image.
- The discussion highlights the vast availability of online images and encourages users to leverage existing resources while also being able to create private images if needed.
- It’s suggested that common configurations for applications (like SQL servers) are typically available on Docker Hub, making it easier for developers to set up environments quickly.
Understanding Job Structures in CI/CD
- The speaker outlines a three-tier organization system commonly used in CI/CD processes: jobs, stages (steps), and their interdependencies. Jobs run independently but can have sequential dependencies defined by the developer.
- Each job consists of stages that execute on designated runners. Developers must specify any relationships between these stages based on their execution order requirements.
- Examples illustrate how independent APIs may need parallel processing during build and test phases while ensuring deployment occurs only when all components pass tests successfully.
- The importance of managing dependencies between different tasks is emphasized; failure in one task should prevent deployment until all related tasks are confirmed successful.
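The job structure above can be sketched in GitHub Actions syntax. The job names and commands are placeholders for illustration:

```yaml
jobs:
  build-api:
    runs-on: ubuntu-latest
    steps:
      - run: echo "build and test the API"        # placeholder command
  build-frontend:
    runs-on: ubuntu-latest                        # runs in parallel with build-api
    steps:
      - run: echo "build and test the frontend"
  deploy:
    needs: [build-api, build-frontend]  # waits for BOTH; skipped if either fails
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploy only after all components pass"
```

Jobs without a `needs` relationship run in parallel by default, which gives exactly the build/test parallelism and gated deployment described above.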
Understanding Artifacts in CI/CD Workflows
The Role of Artifacts
- Artifacts are persistent results that facilitate communication between jobs in a CI/CD pipeline, especially when those jobs do not directly share data.
- An artifact can be the outcome of unit tests, which may not need to persist unless there is a failure. A successful test yields a green signal, while a failure provides feedback for debugging.
Workflow Management with Artifacts
- In scenarios where multiple applications are built and tested, artifacts help manage dependencies and ensure that deployment only occurs after all tests pass successfully.
- To avoid redundant builds (which can be costly), artifacts allow the results from one job to be reused in another, streamlining the deployment process.
Benefits of Using Artifacts
- Utilizing artifacts reduces resource consumption by preventing unnecessary rebuilds; it ensures that the same build output is used across different stages of deployment.
- This method also mitigates risks associated with environmental changes that could affect build outcomes over time.
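Passing a build output between jobs as an artifact can be sketched like this (the artifact name, build command, and jar path are assumptions for illustration):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./gradlew build             # produces build/libs/app.jar (example path)
      - uses: actions/upload-artifact@v4
        with:
          name: app-jar
          path: build/libs/app.jar
  deploy:
    needs: build                         # reuse the SAME build output, no rebuild
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: app-jar
      - run: echo "deploy the downloaded app.jar"
```

Because `deploy` downloads the artifact produced by `build`, the exact bytes that passed the tests are the ones deployed, avoiding both redundant rebuilds and drift from environmental changes.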
Managing Dependencies and Steps
- The workflow includes steps that can pass data through artifacts, allowing for dependency management where certain jobs wait for others to complete before proceeding.
- Each step in this automated platform has specific descriptors detailing how tasks should be executed, including running scripts or commands as needed.
Integration with Platforms
- When creating artifacts on platforms like GitHub or GitLab, users specify what types of artifacts they want to generate within their job configurations.
Workflow Automation in Software Development
Understanding Workflow Descriptions
- The essence of workflow descriptions is that they should be understandable to non-experts. It's challenging to articulate these processes perfectly, as few can write them from memory.
- Editors provided by platforms help automate tasks like running Java applications with Gradle and creating unit tests, making the process more efficient.
Pipeline Functionality
- The discussion highlights the simplicity of certain workflows where version control systems are integrated into platforms like GitHub and Azure DevOps.
- These systems are designed for automation within version-controlled workflows, allowing for extensive customization based on project needs.
Advanced Configuration Options
- Users can configure which branches to run workflows on and set complex conditions based on branch names or other criteria.
- In scenarios where project management systems dictate release schedules, additional build options may be necessary beyond standard automated pipelines.
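Branch filters and scheduled triggers can be combined in the workflow's `on` block. This is a sketch; the branch names and schedule are examples:

```yaml
on:
  push:
    branches: [main]                   # run on every push to main
  pull_request:
    branches: [main, 'release/**']     # and on PRs targeting main or release branches
  schedule:
    - cron: '0 2 * * *'                # nightly run at 02:00 UTC
```

The `schedule` trigger is one way to implement the nightly-build pattern within the same workflow framework, rather than a separate build server.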
Build Framework Examples
- Nightly builds are a common practice where new versions of software are generated overnight for testing; tools like Jenkins and TeamCity facilitate this process.
- These frameworks allow for comprehensive infrastructure management, enabling users to automate various tasks such as builds and installer creation.
Testing Strategies in CI/CD
- Different testing strategies can be implemented within these frameworks; quick unit tests might run frequently while more complex tests could be scheduled weekly.
- Integration with external services is possible, allowing for tailored test runs based on specific development environments or requirements.
Navigating GitHub Workflows
- A brief overview of GitHub's interface shows how workflows can be visually represented and monitored during execution.
Cross-Platform Development Challenges
Understanding Cross-Platform Limitations
- The speaker discusses their experience with a .NET application, highlighting the promise of cross-platform compatibility through .NET Standard and .NET Core. However, they express concerns about potential native code that may hinder true cross-platform functionality.
Testing Across Different Platforms
- There is an emphasis on the importance of testing applications across various operating systems (Windows, Linux distributions). A single test might work on one platform but fail on another due to differences in environment.
Workflow Configuration
- The speaker explains how workflows can be configured to run tests automatically across different platforms. Users can specify triggers for these processes based on actions like push or commit in Git.
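Running the same tests on several operating systems is typically expressed with a build matrix. This is a sketch; the trigger, OS list, and test command are examples (here `dotnet test`, matching the .NET scenario above):

```yaml
on: [push]                       # run on every push
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}    # the same job runs once per operating system
    steps:
      - uses: actions/checkout@v4
      - run: dotnet test         # a test that passes on one OS may fail on another
```

Each matrix entry becomes an independent job, so a Linux-only failure is reported separately from the Windows and macOS runs.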
Permissions and Access Control
- Discussion includes setting permissions for workflows, allowing specific access to resources such as reading from GitHub repositories. Successful tests can trigger automated actions like closing issues.
Managing Large Projects
- In large projects with multiple applications, it’s crucial to configure workflows effectively. The speaker mentions focusing only on relevant parts of the codebase during testing rather than running tests for all components unnecessarily.
Job Management in Workflows
Job Isolation and Execution Environment
- Jobs within workflows are isolated from each other; they do not share information directly. Each job must specify its execution environment (e.g., Ubuntu or Windows), which affects data accessibility between jobs.
Dependency Management Between Jobs
- The speaker highlights the need to define dependencies between jobs clearly. If one job depends on another's success, this relationship must be established within the workflow configuration.
Visualizing Job Dependencies
- Workflow frameworks provide visualization tools for understanding job dependencies and execution order, making it easier to manage complex workflows involving multiple jobs.
Executing Steps Within Jobs
Defining Job Steps
- Each job consists of defined steps that execute specific tasks. These steps require human-readable names for clarity regarding their function within the workflow.
Using Actions in Workflows
- The initial step often involves checking out code from a repository using a checkout action. This ensures that the latest version of the code is available for subsequent operations within the workflow.
Accessing Private Actions
Understanding Build Systems and Node.js Setup
Overview of Build Processes
- The discussion begins by noting that a build may fail to produce the desired outcome, while emphasizing that with proper configuration the desired result is achievable.
- Developers are encouraged to build locally to ensure everything functions correctly before integrating into the build system. Custom scripts can be utilized alongside built-in actions.
Utilizing Built-in Actions
- An example is provided regarding setting up Node.js, where specific configurations for different versions (10, 12, and 14) can be established.
- When configured properly, code can run in parallel across all specified Node.js versions, showcasing the flexibility of the setup.
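The multi-version Node.js setup described above is usually expressed with the `setup-node` action and a version matrix. The step details are a sketch; the version list matches the three versions mentioned:

```yaml
jobs:
  test:
    strategy:
      matrix:
        node-version: [10, 12, 14]   # all three versions run in parallel
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm test
```

Adding or dropping a supported Node.js version is then a one-line change to the matrix.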
Importance of Familiarity with Options
- It is suggested that understanding available options within the build system is beneficial. While memorization isn't necessary, awareness of possibilities allows for better utilization in future projects.
- The speaker notes that certain operating systems like Mac and Windows can be excluded from builds if needed, highlighting customization capabilities.
Conclusion and Next Steps