Algoliterary Encounters

__NOTOC__

== About ==
* [[An Algoliterary Journey]]
* [[Program]]

==Algoliterary works==
A selection of works by members of Algolit that were previously presented in other contexts.
* [[i-could-have-written-that]]
* [[The Weekly Address, A model for a politician]]
* [[In the company of CluebotNG]]
* [[Oulipo recipes]]

==Algoliterary explorations==
This chapter presents part of Algolit's research over the past year.

=== What the Machine Writes: a closer look at the output ===
Two neural networks are presented more closely: what content do they produce? A small sketch of the sampling step behind such text generators follows the list.
* [[CHARNN text generator]]
* [[You shall know a word by the company it keeps]]
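
At every position, a character-level generator such as the [[CHARNN text generator]] outputs a probability for each possible next character, and a 'temperature' parameter decides how daring the choice is. A minimal, hypothetical sketch of that sampling step (not the installations' actual code):

<syntaxhighlight lang="python">
import numpy as np

def sample_char(probs, temperature=1.0):
    """Pick the index of the next character from the network's output.
    Low temperature -> conservative, repetitive text; high -> erratic text."""
    logits = np.log(np.asarray(probs) + 1e-9) / temperature
    scaled = np.exp(logits)
    scaled /= scaled.sum()
    return np.random.choice(len(scaled), p=scaled)

# Hypothetical output of a trained network over a four-character alphabet.
alphabet = ['e', 't', 'a', ' ']
probs = [0.5, 0.3, 0.15, 0.05]
print(alphabet[sample_char(probs, temperature=0.5)])  # almost always 'e'
print(alphabet[sample_char(probs, temperature=2.0)])  # far more varied
</syntaxhighlight>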
  
=== How the Machine Reads: Dissecting Neural Networks ===

==== Datasets ====
Working with Neural Networks involves collecting large amounts of textual data. We compared a 'regular' dataset size with the collection of words of the Library of St-Gilles; a small counting sketch follows the list.
* [[Many many words]]
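
Counting is the first gesture of such a comparison. A minimal sketch, assuming the collection sits in a folder of plain-text files (the folder name is hypothetical):

<syntaxhighlight lang="python">
# Count the total and distinct words in a folder of plain-text files.
from collections import Counter
from pathlib import Path

counts = Counter()
for path in Path("library-saint-gilles").glob("*.txt"):
    counts.update(path.read_text(encoding="utf-8").lower().split())

print("total words:   ", sum(counts.values()))
print("distinct words:", len(counts))
print("most common:   ", counts.most_common(10))
</syntaxhighlight>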
  
=====Public datasets=====
The most commonly used public datasets are gathered at [https://aws.amazon.com/public-datasets/ Amazon]. We looked closely at the following two:
* [[Common Crawl]]
* [[WikiHarass]]
  
=====Algoliterary datasets=====
Working with literary texts allows for poetic beauty in the algorithms' reading and writing. This is a small collection used for experiments.
* [[The data (e)speaks]]
* [[Frankenstein]]
* [[Learning from Deep Learning]]
* [[nearbySaussure]]
* [[astroBlackness]]
  
==== From words to numbers ====
As machine learning is based on statistics and mathematics, words need to be transformed into numbers before text can be processed. In the following section we present three techniques for doing so; a minimal sketch of all three follows the list.
* [[A Bag of Words]]
* [[A One Hot Vector]]
* [[About Word embeddings|Exploring Multidimensional Landscapes: Word Embeddings]]
* [[Crowd Embeddings|Word Embeddings Case Study: Crowd Embeddings]]
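
A minimal sketch of all three representations on a toy corpus (the corpus and all names are invented for illustration):

<syntaxhighlight lang="python">
import numpy as np

# A toy corpus of two 'documents'.
sentences = ["the machine reads the text", "the machine writes a text"]

# Fixed vocabulary: every distinct word gets an index.
vocab = sorted({word for s in sentences for word in s.split()})
index = {word: i for i, word in enumerate(vocab)}

# 1. One-hot vector: a single 1 at the word's index, 0 everywhere else.
def one_hot(word):
    vec = np.zeros(len(vocab))
    vec[index[word]] = 1.0
    return vec

# 2. Bag of words: count how often each vocabulary word occurs in a
#    document; word order is discarded.
def bag_of_words(sentence):
    vec = np.zeros(len(vocab))
    for word in sentence.split():
        vec[index[word]] += 1.0
    return vec

# 3. Word embedding: a dense, low-dimensional vector per word. Here it is
#    randomly initialised; word2vec or GloVe *learn* these values so that
#    words appearing in similar contexts end up with similar vectors.
embeddings = np.random.rand(len(vocab), 4)

print(one_hot("machine"))            # sparse: a single 1 among zeros
print(bag_of_words(sentences[0]))    # "the" is counted twice
print(embeddings[index["machine"]])  # dense: 4 real numbers
</syntaxhighlight>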
  
===== Different visualisations of word embeddings =====
* [[Word embedding Projector]]
* [[The GloVe Reader]]
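
Both works flatten the multidimensional embedding space so that human eyes can look at it. A minimal sketch of the same gesture, projecting a hypothetical embedding matrix to two dimensions with PCA from scikit-learn:

<syntaxhighlight lang="python">
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Stand-ins for a trained model: six words with 100-dimensional vectors.
words = ["machine", "writes", "reads", "text", "word", "number"]
vectors = np.random.rand(len(words), 100)

# Reduce 100 dimensions to the 2 that preserve the most variance.
points = PCA(n_components=2).fit_transform(vectors)

plt.scatter(points[:, 0], points[:, 1])
for word, (x, y) in zip(words, points):
    plt.annotate(word, (x, y))
plt.savefig("embedding-map.png")
</syntaxhighlight>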
  
===== Inspecting the technique behind word embeddings =====
* [[word2vec_basic.py]]
* [[Reverse Algebra]]
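
[[word2vec_basic.py]] is the TensorFlow tutorial script taken apart during the work sessions; [[Reverse Algebra]] plays with the arithmetic that trained embeddings allow. As a hedged sketch of that arithmetic using the gensim library instead (an assumption, not the page's own script), the classic analogy query looks like this:

<syntaxhighlight lang="python">
from gensim.models import Word2Vec  # gensim 4.x assumed

# A hypothetical, far-too-small training corpus of tokenised sentences.
corpus = [
    ["the", "king", "rules", "the", "land"],
    ["the", "queen", "rules", "the", "land"],
    ["a", "man", "and", "a", "woman", "walk"],
]

model = Word2Vec(corpus, vector_size=50, window=3, min_count=1, epochs=200)

# Vector arithmetic on embeddings: king - man + woman ≈ queen.
# On a corpus this tiny the answer is unreliable; the classic result
# only appears when training on large amounts of text.
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
</syntaxhighlight>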

=== How a Machine Might Speak ===
If a computer model for language comprehension could speak, what would it say? A toy sketch of the thermometer idea follows the list.
* [[We Are A Sentiment Thermometer]]
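
As a much simpler, hypothetical sketch of what a 'sentiment thermometer' measures (the installation itself works with a trained neural model, not a word list):

<syntaxhighlight lang="python">
# A toy thermometer: average the scores of the words it recognises.
# The lexicon is invented for illustration.
LEXICON = {"love": 1.0, "wonderful": 0.8, "fine": 0.3,
           "boring": -0.5, "terrible": -0.9, "hate": -1.0}

def temperature(sentence):
    scores = [LEXICON[w] for w in sentence.lower().split() if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

print(temperature("I love this wonderful machine"))  # warm: 0.9
print(temperature("what a boring terrible text"))    # cold: -0.7
</syntaxhighlight>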
 
 
== Sources ==
The scripts we used and a selection of texts that kept us company.
* [[Algoliterary Toolkit]]
* [[Algoliterary Bibliography]]

[[Category:Algoliterary-Encounters]]
