hvm-image-clf


    Introduction

    Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images. We collected dermatoscopic images from different populations, acquired and stored with different modalities. The final dataset consists of 10,015 dermatoscopic images that can serve as a training set for academic machine learning purposes. Cases include a representative collection of all important diagnostic categories in the realm of pigmented lesions:

    • Actinic keratoses and intraepithelial carcinoma / Bowen's disease (akiec)
    • Basal cell carcinoma (bcc)
    • Benign keratosis-like lesions: solar lentigines / seborrheic keratoses and lichen-planus like keratoses (bkl)
    • Dermatofibroma (df)
    • Melanoma (mel)
    • Melanocytic nevi (nv)
    • Vascular lesions: angiomas, angiokeratomas, pyogenic granulomas and hemorrhage (vasc)
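
    For quick reference in code, the class codes above can be kept in a simple lookup table. A minimal illustrative sketch (the LESION_CLASSES name is ours, not part of the package):

        # Illustrative lookup table: class code -> category name, as listed above.
        LESION_CLASSES = {
            "akiec": "Actinic keratoses and intraepithelial carcinoma / Bowen's disease",
            "bcc": "Basal cell carcinoma",
            "bkl": "Benign keratosis-like lesions",
            "df": "Dermatofibroma",
            "mel": "Melanoma",
            "nv": "Melanocytic nevi",
            "vasc": "Vascular lesions",
        }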

    Prerequisites for Use

    All of the data will need to be placed in the directory:

    hvm-image-clf/data
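
    If the data directory does not yet exist, it can be created from the repository root (assuming a Unix-like shell):

    mkdir data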

    First, we need to download the three data files used by the package. The first two can be downloaded via the links below.

    Training Data - Unzip the archive and place the folder named "ISIC2018_TASK3_Training_Input" in the data directory. This folder contains all of the training images.

    Ground Truth - Unzip the archive and place the file named "ISIC2018_Task3_Training_GroundTruth.csv" in the data directory. This file contains the ground truth labels for the training images.

    The third dataset can be downloaded by following this link: Metadata. This will bring you to a website called Harvard Dataverse. On the page you will see a dropdown box called "Access File". Select the option called "Comma Separated Values (Original File Format)" to download the dataset, and place the resulting CSV in the data directory as well.
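
    With all three files in place, a quick sanity check can confirm the layout before running the notebook. A minimal sketch, assuming pandas is installed and that the metadata file was saved as HAM10000_metadata.csv (the exact filename from Harvard Dataverse may differ):

        from pathlib import Path

        import pandas as pd

        DATA_DIR = Path("data")

        # Expected layout, per the steps above. The metadata filename is an
        # assumption; rename it to match the file Dataverse actually saved.
        images_dir = DATA_DIR / "ISIC2018_TASK3_Training_Input"
        ground_truth_csv = DATA_DIR / "ISIC2018_Task3_Training_GroundTruth.csv"
        metadata_csv = DATA_DIR / "HAM10000_metadata.csv"

        for path in (images_dir, ground_truth_csv, metadata_csv):
            print(("OK     " if path.exists() else "MISSING"), path)

        # Peek at the label file: expect one row per training image.
        labels = pd.read_csv(ground_truth_csv)
        print(labels.shape)
        print(labels.head())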

    Project dependencies can be installed using

    pip install -r requirements.txt

    Running the Notebook

    If all of the prerequisites are set up correctly, the notebook can be run using an IPython- or Anaconda-distributed notebook IDE.
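
    For example, with Jupyter installed (an assumption; check requirements.txt for the exact tooling), the notebook can be launched from the repository root with:

    jupyter notebook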