Setting up your first project

Downloading and Organizing Files for Your Zoonyper Project

To set up your Zoonyper project, you’ll need to download all the necessary files and store them in the same folder. Here’s how to do it:

  1. Go to the Lab page of your Zooniverse project and navigate to the “Data Exports” section.

  2. Download the following files:

    • Classifications (as a CSV file)

    • Subjects (as a CSV file)

    • Workflows (as a CSV file)

    • Talk comments (as a JSON file)

    • Tags (as a JSON file)

    Note: Be sure to select the correct files and formats, and avoid downloading the “workflow classifications” file.

  3. Once the files are downloaded, move them to a new folder. Rename them as necessary so that they match these specific names:

    • classifications.csv for the classifications file

    • subjects.csv for the subjects file

    • workflows.csv for the workflows file

    • comments.json for the talk comments file

    • tags.json for the tags file

    (Note: Remember that the last two files are JSON files and not CSV files.)

By following these steps, you’ll have all the necessary files organized and ready to use for your Zoonyper project.
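Before moving on, it can help to confirm that the folder really contains all five files under the expected names. The following is a minimal sketch using only the Python standard library; the helper name `missing_export_files` is hypothetical and is not part of Zoonyper.

```python
from pathlib import Path

# File names taken from the renaming step above.
REQUIRED_FILES = (
    "classifications.csv",
    "subjects.csv",
    "workflows.csv",
    "comments.json",
    "tags.json",
)


def missing_export_files(directory):
    """Return the required file names that are not present in `directory`.

    Hypothetical helper for illustration; not a Zoonyper function.
    """
    folder = Path(directory)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

Calling `missing_export_files("path/to/input-directory")` should return an empty list once every file is in place; any names it returns still need to be downloaded or renamed.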

Initializing a Project with the Downloaded Files

Once you’ve downloaded and organized the necessary files for your Zoonyper project, you can initialize a new Project in Python. Here’s how to do it:

  1. Open a new Python script or Jupyter notebook and import the Zoonyper library:

    from zoonyper import Project
    
  2. Specify the directory path where you stored the downloaded files:

    project = Project("path/to/input-directory")
    

    (Replace "path/to/input-directory" with the actual path to your directory.)

    This creates a new Project object that will contain all the necessary data from the downloaded files.

That’s it! You’ve now successfully initialized a Zoonyper project with your downloaded files.
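A mistyped directory path is an easy mistake to make at this point, so one option is to check the path before handing it to `Project`. The sketch below assumes only what is shown above, namely that `Project` accepts the directory path as a string; the wrapper `load_project` is a hypothetical helper, not part of the library.

```python
from pathlib import Path


def load_project(directory):
    """Check that `directory` exists, then create a Zoonyper Project from it.

    Hypothetical convenience wrapper for illustration.
    """
    folder = Path(directory).expanduser().resolve()
    if not folder.is_dir():
        raise FileNotFoundError(f"Input directory not found: {folder}")

    # Imported here so the path check runs even if Zoonyper is not installed yet.
    from zoonyper import Project

    # The path is passed as a string, as in the example above.
    return Project(str(folder))
```

With this in place, `project = load_project("path/to/input-directory")` fails early with a readable error when the folder does not exist.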

Note

If you are interested in alternative ways to set up a project, check out the Loading a Project tutorial. (The method shown here is equivalent to “Option 2: Specifying directory with required files”.)

Disambiguating subjects (Optional)

To avoid ambiguous classifications and consolidate all classifications per actual subject (rather than the subjects uploaded to Zooniverse), you can perform a process called disambiguation on the downloaded subjects. Disambiguation involves downloading each subject image and extracting a unique identifier for each one, which Zoonyper can use to group identical subjects together.
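To make the idea concrete, here is a minimal sketch of grouping identical files by a content-based identifier. It assumes a SHA-256 digest of the file bytes as the identifier, which is an illustrative choice and not necessarily what Zoonyper computes internally; both function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes (assumed identifier)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def group_identical_files(directory):
    """Map each content digest to the names of the files that share it."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            groups[file_digest(path)].append(path.name)
    return dict(groups)
```

Two images uploaded under different subject IDs but with byte-identical content end up in the same group, which is the kind of consolidation disambiguation aims for.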

To disambiguate the subjects in your Zoonyper project, follow these steps:

  1. Create a new folder to store the subject image files:

    $ mkdir input-directory/downloads/
    
  2. Now, download all the subject files from your project:

    project.download_all_subjects(sleep=(0, 1), organize_by_workflow=False, organize_by_subject_id=False)
    

    Note that this step will take some time as you will have to download every single subject processed in your project. Depending on how many subjects you have across all your workflows, it may take several hours.

    With sleep=(0, 1), the method waits a random number of seconds (between 0 and 1 in this example) between downloads. If you keep running into timeout errors, try increasing these numbers.

    Setting organize_by_workflow=False and organize_by_subject_id=False stores the downloaded files in a flat structure in the downloads folder.

  3. Next, call the .disambiguate_subjects() method on your Project object and pass in the download directory as its argument:

    project.disambiguate_subjects("input-directory/downloads/")
    

    This method processes each downloaded subject image and extracts its unique identifier, which is stored in the project’s metadata. Note that this process may take some time depending on the number of subjects in your project.

That’s it! You’ve now successfully disambiguated the subjects in your Zoonyper project.
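The sleep=(0, 1) setting from step 2 can be pictured as a random pause between requests. The sketch below illustrates that idea only; it assumes a uniformly distributed wait, which may differ from how Zoonyper actually chooses the delay, and `random_pause` is a hypothetical name.

```python
import random
import time


def random_pause(sleep=(0, 1)):
    """Sleep for a random duration within `sleep` and return the seconds waited.

    Illustration of the idea behind the sleep parameter, not Zoonyper's code.
    """
    low, high = sleep
    seconds = random.uniform(low, high)
    time.sleep(seconds)
    return seconds
```

Raising the bounds, for example `random_pause((1, 3))`, spaces requests further apart, which is the effect of increasing the numbers when timeouts occur.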

Finishing Up

Congratulations, you’ve successfully set up and initialized a Zoonyper project with your downloaded files! Here are a couple of final tips to help you get started:

  • Access the project’s subjects and classifications as Pandas DataFrames:

    project.subjects
    project.classifications
    

    These two DataFrames contain all the information you need to start analyzing and visualizing your project data.

  • Check out the Zoonyper documentation and examples for more ideas on how to use the library. Here are a few topics to get you started:

    • Working with workflows and tasks

    • Filtering and grouping classifications

    • Creating visualizations and summary statistics

    • Exporting data in various formats
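As a starting point for filtering and grouping, here is a small pandas sketch. It uses a toy DataFrame as a stand-in for project.classifications; the column names workflow_id and user_name are assumptions based on typical Zooniverse exports and may not match the columns Zoonyper exposes.

```python
import pandas as pd

# Toy stand-in for project.classifications; column names are assumptions.
classifications = pd.DataFrame(
    {
        "workflow_id": [101, 101, 202],
        "user_name": ["ada", "grace", "ada"],
    }
)

# Filter: keep only the classifications made in one workflow.
workflow_101 = classifications[classifications["workflow_id"] == 101]

# Group: count the classifications per workflow.
per_workflow = classifications.groupby("workflow_id").size()
```

The same two operations should carry over to the real DataFrame once you have checked its actual column names, for example with `project.classifications.columns`.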