NYU ITP 2019 Thesis. An interactive experience that shows how a machine interprets the same thing differently from a human.

pondjames007/LostInTranslation

Lost in Translation

NYU ITP 2019 Thesis
An interactive experience that shows how a machine interprets the same thing differently from a human.

Presentation Video in ITP Thesis Week 2019

Introduction

The project runs a recursive process in which a human and a machine interpret each other's results. In each round, the human writes a sentence describing an image generated by the machine. The machine then runs a chain of machine learning translations, turning that description into a sketch and the sketch into a new image, which starts the next round.
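The round structure described above can be sketched in a few lines. This is only an illustration of the loop, not the code in app.py: `run_rounds` and its three callables (`describe`, `text_to_sketch`, `sketch_to_image`) are hypothetical names, and the string-based stand-ins below replace the human and the models.

```python
from typing import Callable, List, Tuple


def run_rounds(
    first_image: str,
    describe: Callable[[str], str],         # human step: image -> sentence
    text_to_sketch: Callable[[str], str],   # machine step: sentence -> sketch
    sketch_to_image: Callable[[str], str],  # machine step: sketch -> image
    rounds: int,
) -> List[Tuple[str, str, str]]:
    """Run the human/machine loop and record every intermediate result."""
    history = []
    image = first_image
    for _ in range(rounds):
        sentence = describe(image)
        sketch = text_to_sketch(sentence)
        image = sketch_to_image(sketch)  # this image is described in the next round
        history.append((sentence, sketch, image))
    return history


# Stand-ins (assumptions) so the loop can run without a person or any model.
history = run_rounds(
    "image-0",
    describe=lambda img: f"caption of {img}",
    text_to_sketch=lambda sentence: f"sketch of [{sentence}]",
    sketch_to_image=lambda sketch: f"image from [{sketch}]",
    rounds=2,
)
for sentence, sketch, image in history:
    print(sentence, "->", sketch, "->", image)
```

Keeping every intermediate result in `history` makes it easy to show how far the meaning has drifted after several rounds.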

Inspiration

Telephone Game

An example of multiple translations
Drawception - Picture Telephone Drawing Game

Closed Loop

A project that uses machine learning to run a feedback loop between images and text.
Jake Elwes - Closed Loop

Implementation

  • Python server with Flask
  • JavaScript client
  • Generate a sentence from an image with im2txt
  • Tag words and extract nouns with spaCy
  • Compare word vectors for similarity with spaCy
  • Draw doodles with SketchRNN
  • Generate new images with AttnGAN
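As a rough illustration of the word-vector step, the sketch below picks the sketch category whose vector is closest to a noun's vector by cosine similarity. It rests on assumptions: that nouns are matched against sketch categories at all is a guess about the pipeline, the helper names are hypothetical, and the three-dimensional vectors are made up. In the project itself spaCy supplies the vectors; with spaCy, nouns can be selected by checking `token.pos_ == "NOUN"` and two tokens compared with `token.similarity(other)`.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_category(noun_vec: Sequence[float],
                     category_vecs: Dict[str, Sequence[float]]) -> str:
    """Return the category name whose vector is most similar to noun_vec."""
    return max(category_vecs, key=lambda name: cosine(noun_vec, category_vecs[name]))


# Toy vectors (assumption): real word vectors have hundreds of dimensions.
toy_categories = {
    "cat": [0.9, 0.1, 0.0],
    "bus": [0.0, 0.2, 0.9],
    "bird": [0.6, 0.6, 0.1],
}
kitten = [0.8, 0.2, 0.0]
print(closest_category(kitten, toy_categories))
```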

app.py

Server code.
Coordinates the pipeline and processes most of the data.
Communicates with Runway and the client over HTTP.
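A minimal sketch of what one HTTP exchange with a model might look like, using only the Python standard library rather than whatever app.py actually uses. The URL `http://localhost:8000/query`, the `caption` field, and the helper names `build_query` and `send_query` are assumptions for illustration. Check the address and input names that your Runway model reports.

```python
import json
import urllib.request
from typing import Any, Dict


def build_query(url: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def send_query(request: urllib.request.Request, timeout: float = 30.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical endpoint and field name; nothing is sent until send_query is called.
request = build_query("http://localhost:8000/query", {"caption": "a bird on a branch"})
print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the network call in one place, so the rest of the server logic can be exercised without a running model.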

static/client.js

Client Code
Present the result and collect user input.

categories.json

A JSON file that stores all sketch categories.

draw_strokes.py

Functions for drawing a sketch.
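For orientation, here is a small sketch of how SketchRNN-style stroke data can be turned into something drawable. It assumes the "stroke-3" format described by Magenta, where each row is `(dx, dy, pen_lifted)` and `pen_lifted = 1` means the pen leaves the paper after that point. The function names are hypothetical and this is not the code in draw_strokes.py.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def strokes_to_polylines(strokes: Sequence[Tuple[float, float, int]],
                         start: Point = (0.0, 0.0)) -> List[List[Point]]:
    """Accumulate relative offsets into absolute points, split where the pen lifts."""
    x, y = start
    lines: List[List[Point]] = []
    current: List[Point] = []
    for dx, dy, pen_lifted in strokes:
        x += dx
        y += dy
        current.append((x, y))
        if pen_lifted:
            lines.append(current)
            current = []
    if current:  # data ended without a final pen lift
        lines.append(current)
    return lines


def polylines_to_svg_path(lines: List[List[Point]]) -> str:
    """Build an SVG path string: M starts each polyline, L continues it."""
    parts: List[str] = []
    for line in lines:
        (x0, y0), rest = line[0], line[1:]
        parts.append(f"M{x0:g},{y0:g}")
        parts.extend(f"L{px:g},{py:g}" for px, py in rest)
    return " ".join(parts)


# Two short strokes: down-right corner, then a horizontal dash.
demo = [(10, 0, 0), (0, 10, 1), (5, 5, 0), (5, 0, 1)]
print(polylines_to_svg_path(strokes_to_polylines(demo)))
```

The resulting string can be placed in the `d` attribute of an SVG `<path>` element.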

drawSketch.py

A test function that draws a sketch.

im2txt

A machine learning model that generates a sentence describing an image.
The model comes from models/research/im2txt. A pre-trained model is provided in Runway.

SketchRNN

A machine learning model that generates doodles in specific categories.
The doodle data comes from Quick, Draw! Details of the data and the model are in Magenta - SketchRNN.
It is downloaded from Google Cloud Platform.

AttnGAN

A machine learning model that generates an image from a sentence.
The model is from GitHub - taoxugit/AttnGAN.
A pre-trained model is provided in Runway.
