E-Learn Knowledge Base


In early 2025, Vsasf Tech ICT Academy, Enugu introduced a hybrid learning system that offers flexibility across all courses offered to the general public. With the E-learn platform powered by Vsasf Nig Ltd, all students can continue learning from a distance irrespective of their location, promoting the ODL (Open and Distance Learning) system of education for Nigerians and the world at large.

Students are encouraged to continue learning online after registering fully through the academy's registration portal. All fully registered students who have completed their training fee payment can click the Login link to access their course materials online.

 Vectors for Data Science in Python 

With the democratization of AI/ML and open-source libraries like Keras, scikit-learn, etc., anyone with basic Python knowledge can set up a working ML classifier in under five minutes. While this is more than enough to get started, if you want to understand how different ML algorithms work or apply the latest SOTA (state-of-the-art) papers to your particular domain, a lack of mathematical expertise quickly becomes a bottleneck, as I have experienced firsthand.

In this set of articles, I will introduce fundamental mathematics concepts one at a time for a non-math audience and show their practical use in the ML/AI domain.

We start off with the simplest of the lot, vectors.

Vectors are simply quantities with direction. A few relatable real-world examples of vectors are force, velocity, and displacement.

To move a shopping cart, you need to push (apply force) in the direction you want the cart to move. The force you expend in moving the cart can be described fully by two values: the intensity (magnitude) of the push and the direction in which you pushed. Any such quantity that requires both magnitude and direction to describe completely is called a vector.

Vectors are usually represented as bold lowercase characters like v, w, etc. Since writing boldface characters with pen and paper is difficult, a vector is also represented with an arrow on top of a lowercase character when written by hand. For this article, we will stick with the boldface representation.

Graphically, vectors are represented as arrows whose length signifies the magnitude (intensity) of the vector and whose angle (from a frame of reference, in this case the horizontal) represents the direction of the vector, as shown below.

Image by Author

Please note that a vector is not required to start from the origin (0,0). Vectors can start from any point. For example, in the above diagram, u = w and v = a, since they have the same magnitude and the same direction.

There are many ways to represent vectors mathematically. As data scientists, the one we are interested in is representing them as a tuple of numbers. Thus vector u can be represented as (2,2), while vector v can be represented as (4,1). The same holds true for vectors w and a.

Though it is easy for us to visualize vectors in 2 and 3 dimensions, the concept of vectors is not limited to 2 and 3 dimensions. It can be generalized to any number of dimensions, and this is what makes vectors so useful in machine learning.

For example, c = (2,1,0) represents a vector in 3-dimensional space, while d = (2,1,3,4) represents a vector in 4-dimensional space. Though we humans cannot visualize dimensions higher than 3, the mathematical way of representing vectors gives us the ability to perform operations in higher-dimensional vector spaces.

By now, you may be wondering why, as an ML enthusiast, you need to learn elementary physics and vectors. It turns out vectors have multiple applications in machine learning, from building recommendation engines to the numerical representation of words for natural language processing, and they form the base of all deep learning models for NLP.

Let’s start with a code example of how vectors are implemented in NumPy and TensorFlow.

Please note: the full code is available as a gist in the last section. Relevant subsections are inserted as pictures for illustration purposes.

Image by Author
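Since the pictured code may not render here, here is a minimal sketch of the same idea; the variable names are assumptions, but the NumPy and TensorFlow calls are standard:

```python
import numpy as np
import tensorflow as tf

# A vector is simply a 1-D array of numbers.
u = np.array([2, 2])   # vector u = (2, 2)
v = np.array([4, 1])   # vector v = (4, 1)
print(u.shape)         # (2,) -- a vector with 2 components

# The same vectors as TensorFlow tensors.
u_tf = tf.constant([2.0, 2.0])
v_tf = tf.constant([4.0, 1.0])
print(u_tf.shape)      # (2,)
```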

Below you can see a simple word2vec implementation that shows the practical use of vectors in natural language processing.

Image by Author
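The original implementation is only available as an image; the following is a minimal sketch of the same idea using gensim's Word2Vec (the toy corpus and parameters are assumptions for illustration):

```python
from gensim.models import Word2Vec

# A toy corpus: each sentence is a list of tokens.
sentences = [
    ["machine", "learning", "uses", "vectors"],
    ["words", "are", "mapped", "to", "vectors"],
    ["vectors", "preserve", "word", "context"],
]

# Train a small word2vec model; every word becomes a 10-dimensional vector.
model = Word2Vec(sentences, vector_size=10, window=2, min_count=1, seed=42)

print(model.wv["vectors"])            # the learned 10-D embedding for "vectors"
print(model.wv.most_similar("word"))  # nearest words in the vector space
```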

As you can see, storing words in a higher-dimensional vector format is one of the main applications of vectors in natural language processing. This type of embedding preserves the context of the word.

In the next section, we will go through basic operations like addition and subtraction and how they apply to vectors.


Vector Addition

Now that we have defined what a vector is, let’s find out how to perform basic arithmetic operations on them.

Let’s take the same two vectors u and v and perform a vector addition on them.

Image by Author

To add two vectors u and v graphically, we move the vector v such that its tail starts at the head of vector u, as shown above (lines DE and EF). The sum of the two vectors is the vector b that starts at the tail of u and ends at the head of v (line DF).

For better intuition, let’s take a real-world example of driving to a grocery store. On the way, you stop at a gas station to fill up. Let’s assume vector u represents how far the gas station (point E) is from your home (point D). If vector v represents the displacement from the gas station to the grocery store, then vector b, drawn from the tail of u to the head of v, represents the sum u + v. It represents how far the grocery store (point F) is from your home (the initial starting point D).

Mathematically, b can be read off the graph as (6, 3).

There are other methods of calculating vector addition graphically, like the parallelogram method, which you can explore on your own.

Now, it is not practical to plot a graph every time we want to do vector arithmetic, especially when it comes to higher-dimensional vectors. Fortunately, the mathematical representation of vectors provides us with an easy way of doing vector addition.

Since each vector is a tuple of numbers, let’s see what we get if we add the corresponding numbers of each vector.

In the example above, u = (2,2) and v = (4,1), so b = (2+4, 2+1) = (6,3), which is the same as the solution obtained graphically.

Thus vector addition can be done by simply adding the corresponding elements of each vector, and as you might have already inferred, only vectors of the same dimension can be added together. Let’s see a code implementation.

Image by Author
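A minimal sketch of the pictured implementation, assuming the same u and v as above:

```python
import numpy as np

u = np.array([2, 2])
v = np.array([4, 1])

# Element-wise addition gives the same result as the graphical method.
b = u + v
print(b)  # [6 3]
```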

Vector Subtraction

Before we move on to vector subtraction, let’s take a quick look at scalar multiplication, another useful operation on vectors.

A scalar is nothing but a quantity with only magnitude and no direction; for example, any real number is a scalar. Real-world examples of scalar quantities are mass and the height of a person.

Let’s see what happens when we multiply a vector with a scalar quantity.

u = (2,2) v = (4,1)

If we want to multiply u by a scalar quantity C = 3, one intuitive way to look at it is to multiply the individual numbers within vector u = (2,2) by 3. Let’s see how that looks.

d = C x u = 3 x u = (3 x 2, 3 x 2) = (6, 6)

Let’s plot the vectors on a graph and see.

Image by Author

As you can see, multiplying a vector u by a positive scalar results in a new vector d in the same direction, but with its magnitude scaled by the factor C = 3.

Let’s try multiplying a vector by a negative value, C = -1.

e = C x v = -1 x v = (-1 x 4, -1 x 1) = (-4, -1).

Let’s plot it and see how that looks.

Image by Author

As you can see, multiplying a vector v by -1 results in a vector with the same magnitude, but in the opposite direction, which can be represented as

e = -v, or e + v = 0 (the null vector)
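A short sketch of both cases, using the same vectors as above:

```python
import numpy as np

u = np.array([2, 2])
v = np.array([4, 1])

d = 3 * u     # positive scalar: same direction, magnitude scaled by 3
e = -1 * v    # negative scalar: same magnitude, opposite direction
print(d)      # [6 6]
print(e)      # [-4 -1]
print(e + v)  # [0 0] -- the null vector
```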

Graphically, vector subtraction can be considered a special case of vector addition, where u - v = u + (-v).

Solving graphically

Image by Author

w = u, a = -v, b = w + a = u + (-v) = u - v = (-2, 1)

As is evident from the graph, b = c; that is, the vector subtraction u - v is equal to the vector c drawn from the head of v to the head of u (the distance between the heads).

Intuitively, this makes sense and is also consistent with traditional number system.

7 - 5 = 2, where 2 is the quantity that, when added to 5, gives 7: 5 + 2 = 7.

Similarly, if you look at the graph, c is the vector which, when added to v, gives the vector u: u = v + c.

Now let’s do this mathematically by subtracting the individual components of the two vectors: c = u - v = (2,2) - (4,1) = (2-4, 2-1) = (-2,1).

The result is the same as that obtained with the graphical method. A code example is given below for reference.

Image by Author
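A minimal sketch of the pictured code, under the same assumptions as the earlier examples:

```python
import numpy as np

u = np.array([2, 2])
v = np.array([4, 1])

# Subtraction is element-wise and matches the graphical result.
c = u - v
print(c)      # [-2  1]
print(v + c)  # [2 2] -- adding c back to v recovers u
```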
Authors: Kishore Ramakrishnan

Introduction to Vectors for Data Science

Vectors in data science describe the properties of a data point across different dimensions. The components of a data point form a vector, and each component corresponds to one dimension.

Point/Vector P in 2-D space

In the above image, point P is a vector in 2-D space with x1 and x2 components (or x and y components).

(0,0) represents the origin.

Point/Vector Q in 3-D space

In the above image, point Q is a vector in 3-D space with x1, x2, and x3 components. A mosquito at one position in a square room is like a data point in 3-D space 😄.

Similarly, a vector can have N dimensions, but it is hard to plot an N-D vector on a 2-D surface. A vector in N-D looks like this: V = [x1, x2, x3, …, xN]

 

Distance of a point from origin:

Let’s see how we can calculate the distance of a point from the origin in a space.

Distance of a point from origin in 2D, 3D and N-D space

In the above image, we have computed the distances of three points from the origin: point P in 2-D, point Q in 3-D, and point X in N-D. We can use the simple Pythagorean theorem to compute the distance of a point from the origin.
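A minimal NumPy sketch of the same computation (the example points are assumptions); the norm of a vector x is sqrt(x1² + x2² + … + xN²), exactly the Pythagorean distance from the origin:

```python
import numpy as np

p = np.array([3, 4])     # a point in 2-D
q = np.array([1, 2, 2])  # a point in 3-D

# np.linalg.norm computes sqrt(x1^2 + x2^2 + ... + xN^2).
print(np.linalg.norm(p))  # 5.0
print(np.linalg.norm(q))  # 3.0
```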

 

Distance between two points:

Let’s see how we can calculate the distance between two points in a space.

Distance between two points in 2D space

In the above image, we are calculating the distance “d” between two points “P” and “Q”. Calculating the distance between two points in a space is similar to calculating the distance of a point from the origin: if point Q were the origin, its coordinates would be (0,0), and the formula would reduce to the previous formula we used to find the distance of a point from the origin.

Distance between two points in 3D and N-D space
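A short sketch with assumed example points; the distance between two points is just the norm of their difference:

```python
import numpy as np

p = np.array([5, 6])
q = np.array([2, 2])

# Distance between two points = norm of their difference vector.
d = np.linalg.norm(p - q)
print(d)  # 5.0, i.e. sqrt((5-2)^2 + (6-2)^2)
```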
 

Types of Vector Representation:

There are two types of vector representation:

Row Vector:

A row vector has one row and n columns.

Row Vector Representation

Column Vector:

A column vector has one column and n rows.

Column Vector Representation
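A minimal NumPy sketch of the two representations (the example values are assumptions):

```python
import numpy as np

# Row vector: 1 row, n columns.
row = np.array([[1, 2, 3]])
print(row.shape)  # (1, 3)

# Column vector: n rows, 1 column.
col = np.array([[1], [2], [3]])
print(col.shape)  # (3, 1)
```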
 

Addition of two vectors:

Addition of two vectors A & B

In the above image, we can see how to add two vectors: the corresponding components are added together.

Multiplication of two vectors:

There are two types of multiplication we can perform on vectors: the dot product and the cross product. For data science, the cross product is not used frequently, so we will focus on the dot product.

Transpose:

Before performing the dot product on two vectors, perform the transpose operation on one of them if both vectors have the same representation, e.g., both are row vectors. The transpose of a vector converts a row vector to a column vector and a column vector to a row vector.

Transpose of vector A

In the above image, vector A^T is the transpose of vector A.
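A minimal sketch in NumPy, where .T transposes a 2-D array (the example row vector is an assumption):

```python
import numpy as np

A = np.array([[1, 2, 3]])  # a row vector, shape (1, 3)

# .T swaps rows and columns: row vector -> column vector.
print(A.T)        # [[1]
                  #  [2]
                  #  [3]]
print(A.T.shape)  # (3, 1)
```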

Dot Product:

We represent the dot product of two vectors by putting a dot between them, e.g., (A . B).

Note: to perform the dot product of two vectors, the number of columns in vector 1 must equal the number of rows in vector 2, which means both vectors must have the same dimension. Before performing the dot product, perform the transpose operation on one of the vectors if both have the same representation, e.g., both are row vectors.

Dot Product of vector A and B
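A minimal sketch of the computation (example values assumed); the dot product multiplies corresponding components and sums them:

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([4, 5, 6])

# Dot product: multiply corresponding components, then sum.
print(np.dot(A, B))  # 1*4 + 2*5 + 3*6 = 32
print(A @ B)         # the @ operator gives the same result
```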
 

Geometric Intuition Behind Dot Product:

Now that we have learned what the dot product is and how to compute it, let’s look at the geometric intuition behind it so that we can connect the dots.

A . B = ||A|| ||B|| cos θ

The above equation also calculates the dot product of two vectors A and B, and it can be rearranged to calculate the angle between them. Here, ||A|| represents the length (magnitude) of vector A, and θ represents the angle between vectors A and B.

Now let’s see how to calculate the angle between two vectors.

Angle between two vectors A and B

In the above image, we can see how easily we can compute the angle between two vectors. Now let’s look at one interesting case.
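A minimal sketch of the rearranged formula, cos θ = (A . B) / (||A|| ||B||), with assumed example vectors:

```python
import numpy as np

A = np.array([1, 0])
B = np.array([1, 1])

# cos(theta) = (A . B) / (||A|| * ||B||)
cos_theta = np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B))
theta = np.degrees(np.arccos(cos_theta))
print(theta)  # ~45.0 -- the angle between A and B in degrees
```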

What if the dot product between two vectors is zero?

Vectors Perpendicular to Each Other

In the above image, we can see that if the dot product of two vectors is zero, the two vectors are perpendicular to each other (since cos 90° = 0).

 

Projection of Vector:

The projection of one vector onto another is like shining light on the first vector so that its shadow falls on the second. Let’s see how to compute the projection of one vector onto another.

Projection of Vector A on B

In the above image, AB is the projection of vector A on vector B. Just imagine that light is coming from a bulb above vector A and the shadow of A is being projected onto B.
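A minimal sketch, assuming the standard formula: the scalar projection of A on B is (A . B) / ||B||, and the vector projection scales B's unit vector by that length.

```python
import numpy as np

A = np.array([2, 3])
B = np.array([4, 0])

# Scalar projection (length of the shadow) of A on B.
scalar_proj = np.dot(A, B) / np.linalg.norm(B)

# Vector projection: scale the unit vector of B by that length.
vector_proj = scalar_proj * (B / np.linalg.norm(B))
print(scalar_proj)  # 2.0
print(vector_proj)  # [2. 0.]
```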

 

Unit Vector:

A unit vector is represented by a hat on top of the vector symbol, e.g., Â. It represents the direction of a vector with a length of exactly one unit.

A unit vector always has the same direction as the original vector.

The length of a unit vector is 1: ||Â|| = 1

Unit Vector of Vector A
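A minimal sketch: dividing a vector by its own norm yields its unit vector (the example vector is an assumption):

```python
import numpy as np

A = np.array([3, 4])

# Unit vector: divide A by its own length.
A_hat = A / np.linalg.norm(A)
print(A_hat)                  # [0.6 0.8] -- same direction as A
print(np.linalg.norm(A_hat))  # 1.0 -- a unit vector always has length 1
```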
Authors: T. C. Okenna

Linear Algebra for Data Science

Linear algebra is the branch of mathematics that deals with vectors, vector spaces, and linear transformations. Linear algebra in data science offers essential tools for working with data in numerous ways: understanding relationships between variables, performing dimensionality reduction, and solving systems of equations. Linear algebra techniques, including matrix operations and eigenvalue decomposition, are widely used for tasks like regression, clustering, and machine learning algorithms.

Importance of Linear Algebra in Data Science

Linear algebra is important in data science because of the crucial role it plays across many parts of the field.

  • It forms the backbone of machine learning algorithms, enabling operations like matrix multiplication, which are essential to model training and prediction.
  • Linear algebra techniques facilitate dimensionality reduction, improving the efficiency of data processing and interpretation.
  • Eigenvalues and eigenvectors help in understanding the variability in data, influencing clustering and pattern recognition.
  • Solving systems of equations is crucial for optimization tasks and parameter estimation.
  • Furthermore, linear algebra supports the image- and signal-processing techniques critical in data analysis.
  • Proficiency in linear algebra empowers data scientists to represent, manipulate, and extract insights from data, ultimately driving the development of accurate models and informed decision-making.

Representation of Problems in Linear Algebra

In linear algebra, problems can frequently be represented and solved using matrices and vectors. 

  • Many real-world situations can be translated into linear equations and expressed in matrix form.
  • Additionally, problems involving transformations such as scaling, rotation, and projection can be depicted using matrices.
  • Data sets can be represented as matrices, in which every row corresponds to an observation and each column corresponds to a feature.
  • Eigenvalues and eigenvectors offer insights into dominant patterns and variations within data, assisting in tasks like dimensionality reduction and understanding variability.
  • Matrix operations can be used to solve linear regression problems and discover optimal coefficients.
  • Classification problems can also be tackled using linear algebra techniques like support vector machines, which involve mapping data into higher-dimensional spaces.

How is Linear Algebra used in Data Science?

Linear algebra is used extensively in data science for numerous tasks and techniques:

  • Data Representation: Data sets are often represented as matrices, wherein every row corresponds to an observation and every column represents a feature. This matrix representation permits efficient manipulation and analysis of data.
  • Matrix Operations: Basic matrix operations like addition, multiplication, and transposition are used for numerous calculations, such as computing similarity measures, transforming data, and solving equations.
  • Dimensionality Reduction: Singular Value Decomposition (SVD) and Principal Component Analysis (PCA) rely on principles from linear algebra to reduce the complexity of data while retaining critical information.
  • Linear Regression: Linear algebra is the basis of linear regression, a widely used technique for modeling relationships between variables and making predictions.
  • Machine Learning Algorithms: Algorithms like support vector machines, linear discriminant analysis, and logistic regression use linear algebra operations to build models and classify data.
  • Image and Signal Processing: Linear algebra techniques are vital in image-processing tasks like filtering, compression, and edge detection. Fourier transforms and convolutions involve linear algebra operations as well.
  • Optimization: Linear algebra is important for optimization algorithms used in machine learning, such as gradient descent, which is based on calculating gradients.
  • Eigenvalues and Eigenvectors: These concepts help identify dominant patterns and directions of variability in data, useful in clustering, feature extraction, and understanding data characteristics.
  • Data Visualization: Dimensionality reduction techniques provided by linear algebra, such as PCA, help visualize high-dimensional data in low-dimensional spaces.
  • Solving Equations: Linear algebra techniques are a common approach to solving systems of linear equations, which arise in optimization problems and parameter estimation (see the sketch below).
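As a small illustration of the last point, a minimal NumPy sketch solving a 2×2 system of linear equations (the system itself is an assumed example):

```python
import numpy as np

# Solve the system:
#   2x + 3y = 8
#    x + 2y = 5
A = np.array([[2, 3], [1, 2]])
b = np.array([8, 5])

x = np.linalg.solve(A, b)
print(x)  # [1. 2.] -> x = 1, y = 2
```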
Authors: T. C. Okenna

OSEMN Methodology

 
Authors: NK Skills

CRISP-DM Methodology:

https://youtu.be/q_okDS2RtzY

Phases in CRISP-DM Methodology

https://youtu.be/NinRBxDVdnM

Authors: Institute of Product Leadership