Algoliterary Encounters: Difference between revisions
From Algolit
(→Datasets)
Line 22:
==== Datasets ====
− * [[The
+ * [[Many many words]]
+
+ * [[The Enron email archive]]
+ * [[Common Crawl]] (used by GloVe): selection of urls (Constant, Maison du Livre...)
+ * [[Google News]] (used by word2vec)
+ * [[Learning from Deep Learning]] (from lib.gen.rus.ec) (.txt)
+ * [[HG Wells personal dataset]] (from Gutenberg.org) (.txt)
+ * Jules Verne (FR), Shakespeare (FR) -> download from Gutenberg & clean up
+ * [[AnarchFem]] (from aaaaarg.fail) (.txt)
==== From words to numbers ====
Revision as of 10:54, 24 October 2017
Start of the Algoliterary Encounters catalog.
General Introduction
Algoliterary works
- Oulipo scripts
- i-could-have-written-that
- Obama, model for a politician
- ClueBotNG, a special Algolit edition
Algoliterary explorations
A few outputs to see how it works
- CHARNN text generator
- human & view & power in 5 landscapes - Five word2vec graphs, each of them containing the words 'human', 'view' and 'power'.
(Before: talking_about_machine_learning - exploring the vocabulary of machine learning textbooks in 7 stages with word2vec)
Parts of NN process
Datasets
- The Enron email archive
- Common Crawl (used by GloVe): selection of URLs (Constant, Maison du Livre...)
- Google News (used by word2vec)
- Learning from Deep Learning (from lib.gen.rus.ec) (.txt)
- HG Wells personal dataset (from Gutenberg.org) (.txt)
- Jules Verne (FR), Shakespeare (FR) -> download from Gutenberg & clean up
- AnarchFem (from aaaaarg.fail) (.txt)
From words to numbers
Different views on the data
Creating word embeddings using word2vec
- word2vec applications - this can serve as an introduction to word2vec?
- word2vec_basic.py - in piles of paper
- softmax annotated
- chatbot for word mathematics
Autonomous machine as inspection
Algoliterary Toolkit
- cgi interface template
- text-punctuation-clean-up.py
Bibliography
- Algoliterary Bibliography - Reading Room texts