Introduction
Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images. We collected dermatoscopic images from different populations, acquired and stored by different modalities. The final dataset consists of 10015 dermatoscopic images which can serve as a training set for academic machine learning purposes. Cases include a representative collection of all important diagnostic categories in the realm of pigmented lesions: actinic keratoses and intraepithelial carcinoma / Bowen's disease (akiec), basal cell carcinoma (bcc), benign keratosis-like lesions (solar lentigines / seborrheic keratoses and lichen-planus like keratoses, bkl), dermatofibroma (df), melanoma (mel), melanocytic nevi (nv) and vascular lesions (angiomas, angiokeratomas, pyogenic granulomas and hemorrhage, vasc).
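For reference, the seven category codes above can be kept in a small lookup table. This is only a sketch: the names `LESION_CATEGORIES` and `describe` are hypothetical and are not part of the package.

```python
# Diagnostic category codes and names, as listed in the introduction.
# LESION_CATEGORIES and describe() are illustrative names, not package API.
LESION_CATEGORIES = {
    "akiec": "Actinic keratoses and intraepithelial carcinoma / Bowen's disease",
    "bcc": "Basal cell carcinoma",
    "bkl": "Benign keratosis-like lesions",
    "df": "Dermatofibroma",
    "mel": "Melanoma",
    "nv": "Melanocytic nevi",
    "vasc": "Vascular lesions",
}


def describe(code: str) -> str:
    """Return the full category name for a code (case-insensitive)."""
    return LESION_CATEGORIES[code.lower()]


print(describe("MEL"))
```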
Prerequisites for Use
All of the data must be placed in the directory:
hvm-image-clf/data
First, download the three data files used by the package. The first two can be downloaded by clicking the links below.
Training Data - Unzip and place the folder named "ISIC2018_TASK3_Training_Input" in the data directory. This contains all of the training images.
Ground Truth - Unzip and place the file named "ISIC2018_Task3_Training_GroundTruth.csv" in the data directory. This contains the ground truth labels for all of the training images.
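Once the ground truth file is in place, its labels can be read roughly as follows. This sketch assumes the CSV has an "image" column followed by one 0/1 column per diagnosis (one-hot encoded); that layout is an assumption and should be checked against the downloaded file. The function name `read_labels` and the sample rows are made up for illustration.

```python
import csv
import io


def read_labels(csv_file):
    """Map each image ID to the lowercase name of its single positive column.

    Assumes a header row with an "image" column and one-hot diagnosis
    columns; this layout is an assumption, not confirmed by the package.
    """
    labels = {}
    for row in csv.DictReader(csv_file):
        image_id = row.pop("image")
        positives = [name for name, value in row.items() if float(value) == 1.0]
        if len(positives) != 1:
            raise ValueError(f"{image_id}: expected one label, got {positives}")
        labels[image_id] = positives[0].lower()
    return labels


# Made-up rows that mimic the assumed layout.
sample = io.StringIO(
    "image,MEL,NV,BCC,AKIEC,BKL,DF,VASC\n"
    "img_a,0.0,1.0,0.0,0.0,0.0,0.0,0.0\n"
    "img_b,1.0,0.0,0.0,0.0,0.0,0.0,0.0\n"
)
print(read_labels(sample))
```

With the real file, pass an open file handle instead of the sample, for example `open("hvm-image-clf/data/ISIC2018_Task3_Training_GroundTruth.csv", newline="")`.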
The third dataset can be downloaded by following this link: Metadata. This will bring you to a website called Harvard Dataverse. On the page you will see a dropdown box called "Access File". Select the option called "Comma Separated Values (Original File Format)" to download the dataset, then place it in the data directory as well.
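Before opening the notebook, it may help to confirm that the downloads landed in the right place. The sketch below checks only the two items this README names explicitly; the metadata file is left out because its filename is not given here. `EXPECTED` and `missing_items` are hypothetical names, not part of the package.

```python
from pathlib import Path

# Items named in this README and whether each is a folder or a file.
EXPECTED = {
    "ISIC2018_TASK3_Training_Input": "dir",
    "ISIC2018_Task3_Training_GroundTruth.csv": "file",
}


def missing_items(data_dir):
    """Return the expected names that are absent from data_dir."""
    data_dir = Path(data_dir)
    missing = []
    for name, kind in EXPECTED.items():
        path = data_dir / name
        present = path.is_dir() if kind == "dir" else path.is_file()
        if not present:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # An empty list means both items were found.
    print(missing_items("hvm-image-clf/data"))
```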
Project dependencies can be installed using
pip install -r requirements.txt
Running the Notebook
If all of the prerequisites are set up correctly, the notebook file can be run using an IPython or Anaconda-distributed notebook IDE.