Commit f5a1060e authored by pjm363 (Philip Monaco)

Merge branch '6-create-a-parsing-for-the-training-and-test-input-sets' into 'main'

Resolve "Create a parsing for the training and test input sets."

Closes #6

See merge request !1
parents f23488f7 5901d98f
# HVM Image Clf
# Introduction
# Prerequisites for Use
1. Download the training data [Here](https://isic-challenge-data.s3.amazonaws.com/2018/ISIC2018_Task3_Training_Input.zip)
2. Download the training ground truth [Here](https://isic-challenge-data.s3.amazonaws.com/2018/ISIC2018_Task3_Training_GroundTruth.zip)
3. Download the test data images [Here](https://isic-challenge-data.s3.amazonaws.com/2018/ISIC2018_Task3_Validation_Input.zip)
4. Download the test data ground truth [Here](https://isic-challenge-data.s3.amazonaws.com/2018/ISIC2018_Task3_Validation_GroundTruth.zip)
5. Unzip the files into the `data` folder of this repository.
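The four downloads and the unzip step above can also be scripted. A minimal sketch using only the standard library (the `data` destination folder follows this repository's layout; the archive names come from the URLs above):

``` python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

BASE_URL = "https://isic-challenge-data.s3.amazonaws.com/2018/"
ARCHIVES = [
    "ISIC2018_Task3_Training_Input.zip",
    "ISIC2018_Task3_Training_GroundTruth.zip",
    "ISIC2018_Task3_Validation_Input.zip",
    "ISIC2018_Task3_Validation_GroundTruth.zip",
]

def fetch_and_unzip(name, dest="data"):
    """Download one archive (skipped if already present) and extract it into dest/."""
    Path(dest).mkdir(exist_ok=True)
    archive = Path(dest) / name
    if not archive.exists():
        urlretrieve(BASE_URL + name, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)

# To fetch everything:
# for name in ARCHIVES:
#     fetch_and_unzip(name)
```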
import glob
import os

import cv2  # computer-vision package, installed as opencv-python
import numpy as np
import pandas as pd


def load_transform_images(folder):
    """Load every .jpg in ./data/<folder> as a grayscale array.

    Args:
        folder (str): Name of the image subdirectory inside ./data.

    Returns:
        list[numpy.ndarray]: 2-D uint8 arrays, one per image.
    """
    images = [cv2.imread(file, flags=cv2.IMREAD_GRAYSCALE)
              for file in glob.glob(os.path.join("./data", folder, "*.jpg"))]
    return images


def transform(data):
    """Scale pixel values to [-1, 1] and flatten each image into one row."""
    rows = []
    for img in data:
        scale = (img.astype(np.float32) - 127.5) / 127.5
        rows.append(scale.reshape(-1))
    # Building the DataFrame once is much faster than row-by-row
    # DataFrame.append, which is removed in recent pandas versions.
    return pd.DataFrame(rows)


# def batch_data(data):
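To sanity-check the scaling arithmetic used in `transform`, the same operation can be run on two fake 4x4 "images" (an all-white and an all-black array, hypothetical stand-ins for real scans):

``` python
import numpy as np
import pandas as pd

# Two fake 4x4 grayscale "images": all-white (255) and all-black (0).
fake_images = [np.full((4, 4), 255, dtype=np.uint8),
               np.zeros((4, 4), dtype=np.uint8)]

# Same arithmetic as transform(): shift to [-1, 1], then flatten per image.
rows = [((img.astype(np.float32) - 127.5) / 127.5).reshape(-1)
        for img in fake_images]
df = pd.DataFrame(rows)

print(df.shape)  # (2, 16): one row per image, one column per pixel
print(float(df.iloc[0, 0]), float(df.iloc[1, 0]))  # 1.0 -1.0
```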
%% Cell type:markdown id:441343d0-e422-4dae-b988-9130e5a0d565 tags:
## Packages
- os: operating-system interface (standard library)
- OpenCV: `cv2`, install with `pip install "opencv-python>=4.5.5"`
%% Cell type:code id:d7e56e0e-7eec-429d-940b-c3337db4b4dc tags:
``` python
import os
import cv2 #vision task package opencv-python
import numpy as np
import pandas as pd
from data_processing import load_transform_images, transform
import matplotlib.pyplot as plt
```
%% Cell type:markdown id:67180bd5-c307-4d97-ac64-a6f7980ce32d tags:
## Load Data
%% Cell type:code id:9997df6a-7b70-4060-9e0c-a8c0ed6c3737 tags:
``` python
files = os.listdir('./data/ISIC2018_Task3_Training_Input')
len(files), files[:10]
```
%% Output
(10015,
['ISIC_0024306.jpg',
'ISIC_0024307.jpg',
'ISIC_0024308.jpg',
'ISIC_0024309.jpg',
'ISIC_0024310.jpg',
'ISIC_0024311.jpg',
'ISIC_0024312.jpg',
'ISIC_0024313.jpg',
'ISIC_0024314.jpg',
'ISIC_0024315.jpg'])
%% Cell type:markdown id:abf26e08-49e0-4f39-8b1e-66504dd70fbf tags:
# Image Processing and Transformation
%% Cell type:markdown id:6601df9a-0323-40e3-96f1-96c92895e4c9 tags:
### Sample conversion from a single image
%% Cell type:code id:35352bf4-6670-4de9-afe2-7efc51c2b151 tags:
``` python
# Using imread to import the image as a Blue, Green, Red composite image
image_data_BGR = cv2.imread('./data/ISIC2018_Task3_Training_Input/ISIC_0024306.jpg', flags=cv2.IMREAD_COLOR)
training_data = load_transform_images("ISIC2018_Task3_Training_Input")
```
%% Cell type:markdown id:efdb6191-44b1-4c3e-8684-790958bf100a tags:
As we can see, the image is loaded as a single 3-dimensional NumPy array of shape (height, width, 3), where the last axis holds the Blue, Green, and Red channel values (OpenCV's BGR order).
%% Cell type:code id:561e73a8-6c36-4baf-b1e6-3d2e05b0004f tags:
``` python
image_data_BGR.shape
plt.imshow(training_data[0], cmap='gray')
plt.show()
```
%% Output
(450, 600, 3)
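%% Cell type:markdown id:c4d1a9e2-53f7-4b28-9a6d-8e0f12b34c56 tags:
Note that `cv2.imread` loads colour images in BGR channel order, while `plt.imshow` assumes RGB. Reversing the last axis swaps the Blue and Red channels; the sketch below (not part of the original notebook) shows this on a tiny synthetic pure-blue pixel so it does not depend on the files on disk:
%% Cell type:code id:7f2b8d10-91ce-4e57-b3a4-5dd6c0e87a21 tags:
``` python
import numpy as np

# A 1x1 synthetic "image": pure blue in OpenCV's BGR order.
bgr = np.zeros((1, 1, 3), dtype=np.uint8)
bgr[..., 0] = 255          # channel 0 is Blue in BGR

rgb = bgr[..., ::-1]       # reverse the last axis -> R, G, B order
print(rgb[0, 0])           # [  0   0 255] -> blue is now last, as RGB expects
```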
%% Cell type:code id:ac323419-fa86-4dd4-a979-ef6b0abcbc15 tags:
``` python
image_data_BW = cv2.imread('./data/ISIC2018_Task3_Training_Input/ISIC_0024306.jpg', flags=cv2.IMREAD_GRAYSCALE)
```
%% Cell type:code id:2cb18b6e-2918-42f6-b6c0-0aff0948abb1 tags:
``` python
df = pd.DataFrame(image_data_BW)
df.max()
```
%% Output
0      171
1      171
2      172
3      171
4      177
      ...
595    176
596    177
597    177
598    177
599    178
Length: 600, dtype: uint8
%% Cell type:markdown id:4936bf4f-422e-44cd-9a0a-d95bf7a1a9fa tags:
Comparison of the first reconstructed image and the original image in the dataset.
<img src="./data/ISIC2018_Task3_Training_Input/ISIC_0024306.jpg" width=300>
%% Cell type:markdown id:4efb6881-8ed3-49a7-b834-f17266938100 tags:
Here we convert the image to grayscale, producing a 2-dimensional NumPy array of intensity values.
%% Cell type:markdown id:2fd91bc2 tags:
### Flatten image
%% Cell type:code id:42ece6b6-581c-4023-97c9-ea1b9f916a70 tags:
``` python
print(image_data_BW)
df = transform(training_data[0:2])
df
```
%% Output
[[158 156 157 ... 163 163 164]
[157 158 161 ... 162 163 162]
[156 159 161 ... 162 163 164]
...
[146 147 149 ... 160 160 158]
[143 146 147 ... 161 163 162]
[143 146 146 ... 160 163 161]]
0 1 2 3 4 5 6 \
0 0.239216 0.223529 0.231373 0.239216 0.247059 0.247059 0.231373
1 0.184314 0.184314 0.200000 0.200000 0.192157 0.223529 0.200000
7 8 9 ... 269990 269991 269992 269993 \
0 0.278431 0.309804 0.278431 ... 0.317647 0.270588 0.247059 0.239216
1 0.200000 0.176471 0.184314 ... 0.207843 0.176471 0.239216 0.239216
269994 269995 269996 269997 269998 269999
0 0.254902 0.231373 0.231373 0.254902 0.278431 0.262745
1 0.247059 0.231373 0.223529 0.215686 0.254902 0.262745
[2 rows x 270000 columns]
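%% Cell type:markdown id:9a3e7f54-2b61-4c08-bd97-6e15f8a2d430 tags:
The 270000 columns are simply the flattened 450x600 pixel grid. A back-of-the-envelope check (not in the original notebook) also shows why batching is worth considering: at float32, the full 10015-image training set would be roughly 10.8 GB if held in memory at once.
%% Cell type:code id:e8b05c27-14da-4f69-a1c3-2f90d6b754e8 tags:
``` python
height, width = 450, 600            # image shape reported by imread above
n_pixels = height * width
print(n_pixels)                     # 270000 columns per flattened image

n_images = 10015                    # files in the training folder
gb = n_images * n_pixels * 4 / 1e9  # float32 = 4 bytes per value
print(round(gb, 1))                 # 10.8 (GB)
```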
%% Cell type:markdown id:4936bf4f-422e-44cd-9a0a-d95bf7a1a9fa tags:
The image we will convert to black and white:
<img src="./data/ISIC2018_Task3_Training_Input/ISIC_0024306.jpg" width=300>
%% Cell type:code id:8c686421-44bd-47ca-9d2b-0627477f0903 tags:
``` python
plt.imshow(image_data_BW, cmap='gray')
plt.show()
```
%% Output
(grayscale rendering of image_data_BW displayed by plt.imshow)