UNIT-1 PYTHON PROGRAMMING-II || ARTIFICIAL INTELLIGENCE || CLASS-12 AI 843 || CBSE 2025-26
Introduction to Python Programming for Class 12 AI Students
Overview of the Unit
- This video is aimed at Class 12 AI students, focusing on the first unit: Python programming. The unit will be evaluated through practicals rather than theory exams.
- The content builds upon knowledge from Class 11, making it easier for students as they have already covered relevant libraries like Pandas and NumPy.
Libraries in Python
- The discussion highlights that pre-written codes in libraries simplify tasks such as creating bar charts using Matplotlib, which allows users to pass parameters easily.
- The statistics module was also mentioned: functions like `mean()` can be applied without learning the underlying formulas; the data is simply passed as an argument and the result comes back directly.
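The mean example above can be sketched with Python's built-in statistics module (the marks list is illustrative):

```python
import statistics

marks = [80, 75, 90, 85]
# mean() computes the average without writing the formula yourself
print(statistics.mean(marks))  # 82.5
```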
Understanding Data Manipulation with Pandas and NumPy
Features of Pandas and NumPy
- These libraries are essential for data manipulation, allowing storage of data in arrays (NumPy) and creation of DataFrames (Pandas) for organizing data efficiently.
- Functions within these libraries enable analysis by accessing individual rows and columns, checking dimensions, and identifying missing values in datasets.
Importance in Data Science
- The use of Pandas and NumPy is crucial in data science due to their ability to handle large datasets effectively, including importing/exporting CSV files and performing various analyses.
- A question may arise regarding the full form of NumPy—Numerical Python—which emphasizes its role in numerical computing applications. Understanding this can be beneficial during viva sessions.
Working with Arrays in NumPy
Array Operations
- NumPy supports operations on arrays such as addition or subtraction between different arrays containing student marks across subjects (e.g., Math and AI). This facilitates easy calculations for total scores.
- It provides an n-dimensional array structure that stores values efficiently; understanding how these structures work is fundamental for effective programming practices.
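The subject-marks idea above can be sketched as follows (the marks are illustrative):

```python
import numpy as np

# Hypothetical marks for three students in two subjects
math = np.array([80, 75, 90])
ai = np.array([85, 70, 95])

# Element-wise addition gives each student's total score
total = math + ai
print(total)  # [165 145 185]
```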
Types of Arrays
- Arrays can be one-dimensional (1D), two-dimensional (2D), or even higher dimensions; each type has specific indexing methods that allow access to elements based on their position within the structure. For example, a 1D array starts indexing from zero.
- Homogeneous nature of arrays means all elements must be of the same data type; if mixed types are supplied, NumPy coerces them to a common type rather than storing genuinely mixed values within a single array structure.
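A small sketch of this homogeneity in practice; the exact dtype names printed can vary by platform:

```python
import numpy as np

a = np.array([1, 2, 3])
print(a.dtype)       # an integer dtype, e.g. int64

# Mixing an int and a string does not keep both types:
# NumPy coerces everything to a common (string) dtype
mixed = np.array([1, "two"])
print(mixed.dtype)   # a Unicode string dtype, e.g. <U21
```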
Creating One-Dimensional Arrays
Implementation Details
- To create a one-dimensional array, you define a linear structure that holds a sequence of elements all sharing the same data type, accessed via a single index starting from zero (e.g., `a = np.array([1, 2, 3])`).
- Accessing elements involves referencing their index position; printing the element at index zero displays its value directly (e.g., printing `a[0]` yields `1`).
Understanding Array Objects in NumPy
Introduction to Array Objects
- The array object in NumPy is referred to as an "ndarray," which can have multiple dimensions.
- The number of dimensions of the array is called the "rank" of the array; for example, a three-dimensional array has a rank of three.
Creating Arrays with Rank One
- To create a one-dimensional array, the NumPy library must be imported using `import numpy as np`.
- The keyword `import` is used to include any library, such as Pandas or NumPy, in your code.
- An array object can be created by assigning the result of the `array` function, called with a list argument, to a variable.
Displaying Array Values
- After creating an array named `arr`, its values (e.g., 1, 2, 3, 4, 5) can be printed along with a message indicating its rank.
- A message can be displayed alongside the output by using double quotes for strings and separating them from variables with commas.
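A minimal sketch of creating the array and printing a message with its rank:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
# ndim gives the rank (number of dimensions) of the ndarray;
# the string message and the variables are separated by commas
print("Array:", arr, "Rank:", arr.ndim)
```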
Creating Two-Dimensional Arrays
Methodology for Creating 2D Arrays
- Two lists are utilized to create a two-dimensional array; for instance, `[1, 2, 3]` and `[4, 5, 6]`.
- This results in a matrix structure where data is organized into rows and columns.
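The two-list construction above can be sketched as:

```python
import numpy as np

# Two inner lists become two rows of a 2-by-3 matrix
m = np.array([[1, 2, 3], [4, 5, 6]])
print(m.shape)  # (2, 3): 2 rows, 3 columns
print(m[1][2])  # row index 1, column index 2 -> 6
```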
Understanding Matrices
- A matrix is defined as a two-dimensional array that organizes elements effectively into grid-like structures.
- Essentially, it consists of arrays within arrays (an "array of arrays"), allowing complex data organization.
Using Tuples in Array Creation
Characteristics of Tuples
- Unlike lists that allow updates and appending items, tuples are immutable once created.
- Elements within tuples can still be accessed via index values but cannot be modified or extended after creation.
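A short sketch of tuple immutability:

```python
t = (1, 2, 3)   # a tuple is fixed once created
print(t[0])     # indexing still works -> 1

try:
    t[0] = 99   # attempting to modify raises TypeError
except TypeError:
    print("tuples cannot be modified after creation")
```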
Introduction to Pandas Library
Overview of Pandas Functionality
- The name pandas comes from "Panel Data," a term for datasets that record observations across different entities over time.
- Pandas facilitates loading datasets and displaying summary statistics while enabling group-wise analysis for performance evaluation.
Data Structures in Pandas
- Pandas primarily provides two data structures: Series (one-dimensional labeled arrays capable of holding various data types), and DataFrame (two-dimensional labeled data structure).
Creating Series and DataFrames in Pandas
Introduction to Series
- A Series can be created from scalar values or a list of data items; by default its indices are labeled 0, 1, 2, 3, and so on.
- To create a Series, the `Series` function from the Pandas library is used, similarly to how arrays were created with NumPy.
Using the Series Function
- A variable named `my_var` is introduced as a Series object by passing a list (e.g., `[1, 7, 2]`) into the `pd.Series()` function.
- The output of printing `my_var` shows both the values and their corresponding indices (0, 1, 2), demonstrating how elements can be accessed via these indices.
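The `my_var` example can be reproduced as:

```python
import pandas as pd

# A Series gets a default integer index: 0, 1, 2
my_var = pd.Series([1, 7, 2])
print(my_var)
print(my_var[0])  # access the first element via its index -> 1
```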
Limitations of Series
- It is noted that a series can only store one-dimensional data. For example, if we want to store marks for multiple subjects for students, this cannot be done effectively with a series.
- Instead of using a series for multi-dimensional data storage (like multiple subjects), it’s recommended to use DataFrames in Pandas.
Understanding DataFrames
- A DataFrame is described as a two-dimensional data structure that organizes data in rows and columns similar to matrices or tables.
- The creation of DataFrames will involve methods such as creating them from NumPy arrays or dictionaries containing lists.
Methods for Creating DataFrames
- Two methods are discussed:
- First method involves creating a DataFrame using NumPy arrays.
- Second method involves creating it from dictionaries where keys become column names and values become column entries.
Important Considerations When Creating DataFrames
- When using NumPy arrays to create DataFrames:
- Each array corresponds to a row; thus understanding this mapping is crucial when structuring your data correctly.
Example Creation Process
- An example illustrates how three rows and four columns can be structured through an array setup.
- In dictionary-based creation:
- Keys represent column names while their associated lists represent the values under those columns.
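Both creation methods can be sketched together; all of the values here are illustrative:

```python
import numpy as np
import pandas as pd

# Method 1: from a NumPy array -- each inner array becomes one row
arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])  # 3 rows, 4 columns
df1 = pd.DataFrame(arr)

# Method 2: from a dictionary -- keys become column names,
# and each key's list becomes that column's values
df2 = pd.DataFrame({"Math": [80, 75], "AI": [85, 70]})

print(df1.shape)             # (3, 4)
print(df2.columns.tolist())  # ['Math', 'AI']
```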
Common Mistakes in Structuring DataFrames
- Emphasis on avoiding confusion between rows and columns when defining structures; incorrect assignments could lead to errors during execution.
Final Steps in Creating DataFrame Objects
- The process concludes with ensuring proper imports (e.g., importing pandas as pd).
- After setting up arrays/lists correctly within the `DataFrame()` call, users can define column headers if necessary.
DataFrame Creation Techniques in Pandas
Using the index Attribute
- The `index` attribute of a DataFrame is essential for defining row labels, while column headers can be assigned using the `columns` attribute.
- When creating a DataFrame, if you need to define index values explicitly, you can use the `index` attribute to set them accordingly.
Assigning Column Headers
- Column headers can be assigned as lists or tuples. This flexibility allows for various data structures when defining DataFrame columns.
- Both lists and tuples can also be used for index values, providing versatility in how data is structured within a DataFrame.
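A sketch combining the `index` and `columns` attributes; the labels are illustrative, and note that a list works for one and a tuple for the other:

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2, 3], [4, 5, 6]])
# Row labels via index= (a list), column headers via columns= (a tuple)
df = pd.DataFrame(arr, index=["r1", "r2"], columns=("A", "B", "C"))
print(df)
```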
Creating DataFrames from Dictionaries
- A common method for creating a DataFrame involves passing a dictionary of lists or arrays. Each key-value pair represents column names and their corresponding data.
- The syntax requires using curly braces `{}` to denote key-value pairs, where keys serve as column names and values are the associated data.
Handling Missing Values
- When constructing a DataFrame from dictionaries, missing values will appear as NaN (Not a Number). This behavior is consistent across different methods of creation.
- If no row labels are specified during creation, default indexing (0, 1, 2, ...) will apply unless defined otherwise with the `index` parameter.
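A minimal sketch of NaN appearing for a missing entry (the names and marks are made up):

```python
import pandas as pd

# The second student's marks are missing; pandas stores this as NaN
data = {"Name": ["Asha", "Ravi"], "Marks": [88, None]}
df = pd.DataFrame(data)
print(df)
print(df["Marks"].isnull().tolist())  # [False, True]
```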
Creating DataFrames from Lists of Dictionaries
- Another approach involves creating a DataFrame from a list of dictionaries. Each dictionary corresponds to one row in the resulting DataFrame.
- You can also assign Series directly within this structure; each Series acts like an individual column with its own index.
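The list-of-dictionaries approach can be sketched as follows (keys and values are illustrative):

```python
import pandas as pd

# Each dictionary becomes one row; a key missing from a row becomes NaN
rows = [{"Math": 80, "AI": 85},
        {"Math": 75}]           # no "AI" key in the second row
df = pd.DataFrame(rows)
print(df)
```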
Adding New Columns to Existing DataFrames
- To add new columns to an existing DataFrame, reference the DataFrame name followed by square brackets containing the new column name and assign it values directly.
- For example, adding a new column named "Fatima" would involve specifying its name in square brackets and assigning it appropriate values.
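Adding a column can be sketched like this; the column name `Total` and the marks are illustrative, not from the transcript:

```python
import pandas as pd

df = pd.DataFrame({"Math": [80, 75], "AI": [85, 70]})
# Name the new column in square brackets and assign its values
df["Total"] = df["Math"] + df["AI"]
print(df)
```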
Data Manipulation in Python: Inserting and Deleting Rows and Columns
Inserting New Rows into a DataFrame
- To add a new row to a DataFrame, use the syntax `dataframe_name.loc[]`, where you specify the index label for the new row.
- If there are incorrect values (e.g., wrong marks), you can modify them by accessing the specific row and column using `dataframe_name.at[]`.
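A sketch of inserting a row with `.loc[]` and correcting a single value with `.at[]` (the values are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Math": [80, 75], "AI": [85, 70]})
df.loc[2] = [90, 95]    # insert a new row at index label 2
df.at[0, "Math"] = 82   # fix one value by row label and column name
print(df)
```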
Deleting Rows and Columns from a DataFrame
- The method `dataframe_name.drop()` is used to delete rows or columns. The parameter `axis=0` indicates that rows will be deleted, while `axis=1` indicates columns.
- For deleting multiple rows, list their index labels within square brackets and set `axis=0`. This allows for efficient removal of several rows at once.
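Deleting rows and columns with `drop()` can be sketched as follows; note that `drop()` returns a new DataFrame rather than modifying the original in place:

```python
import pandas as pd

df = pd.DataFrame({"Math": [80, 75, 90], "AI": [85, 70, 95]})
df_rows = df.drop([0, 2], axis=0)  # drop rows with index labels 0 and 2
df_cols = df.drop("AI", axis=1)    # drop the "AI" column
print(df_rows)
print(df_cols.columns.tolist())
```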
Accessing DataFrame Properties
- Key properties of a DataFrame include its index, which holds the row labels; use `dataframe.index` to access them.
- To get column names, use `dataframe.columns`, while the shape of the DataFrame (number of rows and columns) can be accessed with `dataframe.shape`.
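These properties can be inspected like so:

```python
import pandas as pd

df = pd.DataFrame({"Math": [80, 75], "AI": [85, 70]})
print(df.index)    # row labels (default RangeIndex 0..1)
print(df.columns)  # column names
print(df.shape)    # (rows, columns) -> (2, 2)
```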
Displaying Portions of a DataFrame
- The methods `.head(n)` and `.tail(n)` display the first or last n rows respectively. By default, they show five rows if no argument is provided.
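A quick sketch of `.head()` and `.tail()`:

```python
import pandas as pd

df = pd.DataFrame({"n": range(10)})
print(df.head(3))  # first 3 rows
print(df.tail())   # last 5 rows (default when no argument is given)
```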
Understanding CSV Files
- CSV stands for Comma-Separated Values; it is commonly used for storing tabular data in plain text format where each line represents a row.
- CSV files are easy to read/write for both humans and computers, making them ideal for data storage. They allow straightforward import/export operations between applications like Notepad or Excel.
Importance of CSV Files in Data Analysis
- CSV files play a crucial role in data analysis as they facilitate changes and manipulations on datasets stored within them.
- Using libraries like Pandas allows users to load data from CSV files into DataFrames efficiently, enabling complex operations on structured data.
Creating and Handling CSV Files in Google Colab
Steps to Create a CSV File
- Begin by creating a student CSV file: input your data into an Excel sheet, then save it as a CSV file.
- Navigate to the "File" menu, select "Save As," and choose the comma-delimited (CSV) format for saving. Name the file "student_mod.csv" for consistency with future tasks.
Uploading and Accessing the CSV File
- Open Google Colab and click on the folder icon to upload your newly created CSV file. Ensure that it appears in your list of files after a successful upload.
- Use the provided code from your reference book (`import pandas as pd`) to read data from the uploaded CSV file into a DataFrame.
Exporting Data from DataFrame
- After reading data into a DataFrame (stored in 'df'), you can export this data back into another CSV file named "result_add.csv" without including index values.
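The read/export round trip can be sketched as follows; a small stand-in CSV is created first so the snippet is self-contained (the filenames follow the transcript, the data is made up):

```python
import pandas as pd

# Stand-in for the uploaded student file
pd.DataFrame({"Name": ["Asha", "Ravi"], "Marks": [88, 92]}).to_csv(
    "student_mod.csv", index=False)

df = pd.read_csv("student_mod.csv")       # load the CSV into a DataFrame
df.to_csv("result_add.csv", index=False)  # export without index values
print(df)
```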
Handling Missing Values
Identifying Missing Values
- Discusses how to handle missing values within a DataFrame. If any value is missing due to various reasons, functions are available to check for these gaps.
- To eliminate rows with missing values, use specific features that allow dropping such rows from your dataset.
Strategies for Managing Missing Values
- You can either drop rows containing missing values or estimate them based on other available data points.
- The `isnull()` function checks for missing values in your DataFrame, returning True or False for each entry.
Dropping Rows with Missing Values
- Utilize the `dropna()` function, which removes all rows containing any NaN (missing value), resulting in a cleaner dataset suitable for analysis.
Estimating Missing Values
- The `fillna()` function allows you to replace NaN values with specified estimates, such as a constant or an average derived from the remaining data.
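The three functions can be sketched together; the marks and NaN positions are illustrative, and filling with the column mean is one common choice:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Math": [80, np.nan, 90], "AI": [85, 70, np.nan]})
print(df.isnull())             # True wherever a value is missing
clean = df.dropna()            # drop any row containing NaN
filled = df.fillna(df.mean())  # replace NaN with each column's average
print(clean.shape)             # only the fully complete row survives
print(filled["Math"].tolist())
```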
Practical Application of Functions
Checking Specific Columns for Missing Values
- You can check specific columns for missing values by referencing the column name followed by `.isnull().any()`, which indicates whether any NaN exists in that column.
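A column-specific check can be sketched as:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Math": [80, np.nan], "AI": [85, 70]})
print(df["Math"].isnull().any())  # True: Math has a missing value
print(df["AI"].isnull().any())    # False: AI is complete
```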
Conclusion on Handling Missing Values
- Understanding how to effectively manage missing values is crucial; functions like `dropna()` and `fillna()` are essential tools in ensuring robust data analysis processes.
Data Analysis Techniques for Missing Values
Understanding Missing Values in Data Frames
- The speaker discusses functions for identifying missing values within a specific row of a DataFrame, emphasizing the importance of understanding data completeness.
- A method is introduced to calculate the total number of missing values across an entire DataFrame by chaining summation functions, highlighting its utility in data analysis.
- The explanation includes practical steps on how to implement these functions effectively, ensuring clarity for viewers who may be new to data manipulation techniques.
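The row-wise and overall counts can be sketched by chaining `isnull()` with `sum()` (the data is illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Math": [80, np.nan, 90], "AI": [np.nan, 70, np.nan]})
per_row = df.isnull().sum(axis=1)  # missing values in each row
total = df.isnull().sum().sum()    # grand total across the DataFrame
print(per_row.tolist(), total)     # [1, 1, 1] 3
```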
- The speaker expresses hope that viewers will find value in the video content, indicating a focus on educational outcomes and viewer engagement.