Optical Recognition of Handwritten Digits Dataset
The load_digits function from scikit-learn loads a dataset of 1,797 handwritten-digit samples, each an 8×8 grayscale image flattened into 64 features. With 10 class labels (0-9), it is a common starting point for practicing image classification in machine learning.
Classes | 10 |
Samples per class | ~180 |
Samples total | 1797 |
Dimensionality | 64 |
Features | integers 0-16 |
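The figures in the table can be checked directly against the loaded data. The snippet below is a minimal sketch that assumes NumPy is installed alongside scikit-learn; the variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# Flattened feature matrix: one row per image, one column per pixel
n_samples, n_features = digits.data.shape

# The same pixels are also exposed as 8x8 arrays under `images`
image_shape = digits.images.shape[1:]

# Class labels and the number of samples in each class
classes, counts = np.unique(digits.target, return_counts=True)

# Range of pixel intensities across the whole dataset
pixel_min, pixel_max = digits.data.min(), digits.data.max()

print(f"samples: {n_samples}, features: {n_features}, image shape: {image_shape}")
print(f"classes: {classes.tolist()}")
print(f"samples per class: {counts.tolist()}")
print(f"pixel range: {pixel_min} to {pixel_max}")
```

The per-class counts are close to, but not exactly, 180 each, which is why the table lists the value as approximate.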
Example: loading the digits dataset into a pandas DataFrame:
from sklearn.datasets import load_digits
import pandas as pd
# Load the digits dataset
digits = load_digits()
# Create a DataFrame with one named column per pixel (pixel_0 ... pixel_63)
pixel_columns = [f'pixel_{i}' for i in range(digits.data.shape[1])]
digits_df = pd.DataFrame(data=digits.data, columns=pixel_columns)
# Append the class label (0-9) as the last column
digits_df['target'] = digits.target
# Print the first few rows of the DataFrame
print(digits_df.head())
Output:
pixel_0 pixel_1 pixel_2 pixel_3 pixel_4 pixel_5 pixel_6 pixel_7 \
0 0.0 0.0 5.0 13.0 9.0 1.0 0.0 0.0
1 0.0 0.0 0.0 12.0 13.0 5.0 0.0 0.0
2 0.0 0.0 0.0 4.0 15.0 12.0 0.0 0.0
3 0.0 0.0 7.0 15.0 13.0 1.0 0.0 0.0
4 0.0 0.0 0.0 1.0 11.0 0.0 0.0 0.0
pixel_8 pixel_9 ... pixel_55 pixel_56 pixel_57 pixel_58 pixel_59 \
0 0.0 0.0 ... 0.0 0.0 0.0 6.0 13.0
1 0.0 0.0 ... 0.0 0.0 0.0 0.0 11.0
2 0.0 0.0 ... 0.0 0.0 0.0 0.0 3.0
3 0.0 8.0 ... 0.0 0.0 0.0 7.0 13.0
4 0.0 0.0 ... 0.0 0.0 0.0 0.0 2.0
pixel_60 pixel_61 pixel_62 pixel_63 target
0 10.0 0.0 0.0 0.0 0
1 16.0 10.0 0.0 0.0 1
2 11.0 16.0 9.0 0.0 2
3 13.0 9.0 0.0 0.0 3
4 16.0 4.0 0.0 0.0 4
[5 rows x 65 columns]
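Since the dataset is meant for practicing classification, a short end-to-end example may help. The sketch below trains a k-nearest-neighbors classifier on the 64 pixel features. The choice of classifier, the 75/25 split, n_neighbors=3 and random_state=0 are assumptions made for illustration, not settings taken from this article.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()

# Hold out 25% of the samples for evaluation; stratify keeps the
# class proportions similar in the training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    digits.data,
    digits.target,
    test_size=0.25,
    random_state=0,
    stratify=digits.target,
)

# Illustrative choice: classify each image by its 3 nearest training images
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)

# Fraction of held-out images whose digit is predicted correctly
accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Any other scikit-learn classifier can be swapped in for KNeighborsClassifier here, because they share the same fit and score interface.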
What Is a Toy Dataset? Types, Purpose, Benefits, and Applications
Toy datasets are small, simple datasets commonly used in the field of machine learning for training, testing, and demonstrating algorithms. These datasets are typically clean, well-organized, and structured in a way that makes them easy to use for instructional purposes, reducing the complexities associated with real-world data processing.