
Find a Dataset on GitHub and Clone It

Goals (Learning Objectives)

After doing this tutorial you will know how to find a CSV dataset on GitHub, fork it into your own account, and clone it to your local machine.

Activities

Step: Find a Dataset on GitHub

Find a dataset on GitHub that contains CSV files.

Here are some lists of datasets you could choose from:

* https://github.com/datasets/
* https://github.com/caesar0301/awesome-public-datasets

Be careful: not all of these datasets are in CSV format. You need CSV files for this exercise.

Step: Fork the Dataset

Fork the dataset into your GitHub account.

Step: Clone the Dataset to your Local Machine

Clone the git repository in the same way that you would clone any git repository:

  1. Get the git clone URL for your fork of the dataset (e.g. git@github.com:yourgithubname/datasetname).
  2. Clone the dataset to your local machine using that URL: git clone git@github.com:yourgithubname/datasetname
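
The clone step, plus a quick check that the clone really contains CSV files, could be sketched as the shell functions below. This is a minimal sketch, not a required part of the exercise: the function names `clone_fork` and `list_csv_files` are hypothetical helpers, and `yourgithubname` and `datasetname` are placeholders for your own GitHub username and the repository you forked. It assumes you have SSH access to GitHub set up.

```shell
# Clone your fork over SSH.
# Arguments: $1 = your GitHub username, $2 = the dataset repository name.
# (Hypothetical helper; it only wraps the plain git clone command.)
clone_fork() {
  git clone "git@github.com:$1/$2"
}

# Print the path of every CSV file found under the given directory.
# Arguments: $1 = path to the cloned repository.
list_csv_files() {
  find "$1" -type f -name '*.csv'
}

# Example usage (placeholders, replace with your own values):
#   clone_fork yourgithubname datasetname
#   list_csv_files datasetname
```

If `list_csv_files` prints nothing, the dataset probably stores its data in another format, and you may want to pick a different dataset before moving on.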

Next Steps

Next, import the dataset into Noms.