Distance Between Two Points In Two Different Dataframes In Pandas

Let us see a few different approaches to finding the distance between two points in two different DataFrames in Python.

Using Euclidean Distance Formula

The Euclidean distance is the most commonly used distance metric: it is the straight-line distance between two points. To find the distance between corresponding points in two DataFrames, take the square root of the sum of the squared differences between their X and Y coordinates.
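Written out for row i, the rule above looks like this. The subscripts 1 and 2 are notation introduced here for the first and second DataFrame; they do not appear in the code.

```latex
d_i = \sqrt{(x_{1,i} - x_{2,i})^2 + (y_{1,i} - y_{2,i})^2}
```

For the first row of the sample data used below, with points (1, 2) and (2, 3), this gives \sqrt{(1-2)^2 + (2-3)^2} = \sqrt{2} \approx 1.414.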

Example: We first import the pandas and NumPy modules and create two DataFrames with some sample data. We then apply the Euclidean formula, using NumPy's sqrt() function to take the square root of the sum of the squared differences of the X and Y coordinates. The subtraction pairs rows by index label, so row 0 of df1 is compared with row 0 of df2, and so on.

Python
# import pandas and numpy
import pandas as pd
import numpy as np

# dataframe 1
df1 = pd.DataFrame({
    'Name': ['w3wiki', 'CodingForAll', 'CodeWars'],
    'X': [1, 4, 5],
    'Y': [2, 5, 1]
})

# dataframe 2
df2 = pd.DataFrame({
    'Name': ['w3wiki', 'CodingForAll', 'CodeWars'],
    'X': [2, 3, 6],
    'Y': [3, 2, 4]
})

# euclidean formula
res = np.sqrt((df1['X'] - df2['X'])**2 + (df1['Y'] - df2['Y'])**2)
print(res)

Output:

0    1.414214
1    3.162278
2    3.162278
dtype: float64
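The calculation above relies on both DataFrames listing the points in the same row order. If that is not guaranteed, one option is to pair the rows on the Name column first. The following is a minimal sketch under that assumption: the shuffled df2, the suffixes _1 and _2, and the distance column name are all illustrative choices, not part of the original example.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'Name': ['w3wiki', 'CodingForAll', 'CodeWars'],
    'X': [1, 4, 5],
    'Y': [2, 5, 1]
})

# same points as in the example above, but with the rows deliberately shuffled
df2 = pd.DataFrame({
    'Name': ['CodeWars', 'w3wiki', 'CodingForAll'],
    'X': [6, 2, 3],
    'Y': [4, 3, 2]
})

# pair rows by Name; overlapping columns get the suffixes _1 and _2
merged = df1.merge(df2, on='Name', suffixes=('_1', '_2'))

# Euclidean distance between the paired points
merged['distance'] = np.sqrt(
    (merged['X_1'] - merged['X_2'])**2 + (merged['Y_1'] - merged['Y_2'])**2
)
print(merged[['Name', 'distance']])
```

Because the rows are matched by name rather than by position, the distances come out the same as in the example above even though df2 is in a different order.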

Using Numpy’s Linalg Norm

The numpy.linalg.norm() function can return one of eight matrix norms or one of an infinite number of vector norms, depending on its ord parameter. When an integer axis is given and ord is left at its default, it computes the 2-norm of each vector along that axis, which is the vector's Euclidean length. By subtracting the X and Y coordinates of one DataFrame from those of the other and applying norm() with axis=1, we obtain the Euclidean distance between each pair of corresponding points.

Example: We first import the pandas and NumPy modules and create two DataFrames with some sample data. We then pass the row-wise coordinate differences to numpy.linalg.norm() with axis=1 to obtain the distance between corresponding points.

Python
# import pandas and numpy
import pandas as pd
import numpy as np

# dataframe 1
df1 = pd.DataFrame({
    'Name': ['w3wiki', 'CodingForAll', 'CodeWars'],
    'X': [1, 4, 5],
    'Y': [2, 5, 1]
})

# dataframe 2
df2 = pd.DataFrame({
    'Name': ['w3wiki', 'CodingForAll', 'CodeWars'],
    'X': [2, 3, 6],
    'Y': [3, 2, 4]
})

# row-wise Euclidean norm of the coordinate differences
# (the arrays are subtracted by position, not by index label)
res = np.linalg.norm(df1[['X', 'Y']].to_numpy() - df2[['X', 'Y']].to_numpy(), axis=1)
print(res)

Output:

[1.41421356 3.16227766 3.16227766]
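The same call can compute other vector norms through the ord parameter. The sketch below uses the coordinate differences from the sample data (df1 minus df2), hard-coded in an array to keep it self-contained. The variable names are illustrative.

```python
import numpy as np

# coordinate differences (df1 - df2) for the three sample points
diff = np.array([[-1, -1],
                 [ 1,  3],
                 [-1, -3]])

# default ord with an integer axis: 2-norm of each row (Euclidean distance)
euclidean = np.linalg.norm(diff, axis=1)

# ord=1: sum of absolute differences per row (Manhattan distance)
manhattan = np.linalg.norm(diff, ord=1, axis=1)

# ord=np.inf: largest absolute difference per row (Chebyshev distance)
chebyshev = np.linalg.norm(diff, ord=np.inf, axis=1)

print(euclidean)
print(manhattan)
print(chebyshev)
```

Only the default setting reproduces the Euclidean distances shown above. The other two values of ord measure distance differently and give different numbers.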

