INPIRICAL
Intelligent empirical analysis.
9 followers

If interested in applying R to financial analysis, here's a discussion of a momentum trading R codebase with visualisation.
Code has been released under the MIT license.

Commented R code for compiling a distant supervision sentiment lexicon with more than 40k entries, for use in training sentiment classifiers.
Compiling a sentiment lexicon of 44k words using a 1.6m item database of tweets with emoticons stripped. The full R code has been released under the MIT license.

Entries are scored within a -5 to +5 range based on pointwise mutual information with positive and negative emoticons.

https://www.linkedin.com/pulse/article/20141205003839-34768479-1-6m-emoticon-stripped-tweets-44k-entry-sentiment-lexicon
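
The scoring idea in this post can be sketched in a few lines. What follows is a minimal illustration in Python, not the released R code: it assumes tweets have already been split into a positive and a negative set according to the emoticon they originally carried, uses whitespace tokenization, add-one smoothing, and simple clipping to the -5 to +5 range. The function name `pmi_sentiment_scores` and its parameters are hypothetical, and how the original code maps values into that range is not stated in the post.

```python
import math
from collections import Counter


def pmi_sentiment_scores(pos_tweets, neg_tweets, smoothing=1.0, limit=5.0):
    """Score each word by PMI(word, positive) - PMI(word, negative).

    The difference of the two PMI values simplifies to
    log2(P(word | positive) / P(word | negative)).
    Smoothing and clipping are assumptions made for this sketch.
    """
    # Unigram counts per class, using naive whitespace tokenization.
    pos_counts = Counter(tok for t in pos_tweets for tok in t.lower().split())
    neg_counts = Counter(tok for t in neg_tweets for tok in t.lower().split())

    n_pos = sum(pos_counts.values())
    n_neg = sum(neg_counts.values())
    vocab = set(pos_counts) | set(neg_counts)

    scores = {}
    for word in vocab:
        # Smoothed conditional probabilities, so unseen words do not give log(0).
        p_pos = (pos_counts[word] + smoothing) / (n_pos + smoothing * len(vocab))
        p_neg = (neg_counts[word] + smoothing) / (n_neg + smoothing * len(vocab))
        raw = math.log2(p_pos / p_neg)
        # Clip to the [-limit, +limit] range (assumed mapping).
        scores[word] = max(-limit, min(limit, raw))
    return scores
```

Calling `pmi_sentiment_scores(["great day", "great fun"], ["awful day", "awful traffic"])` gives "great" a positive score, "awful" a negative one, and "day", which appears equally often in both sets, a score of zero.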

Natural language processing in R of a large database of tweets, with emoticons stripped and used for "distant supervision" machine learning.

Here is the full commented R code for algorithmically compiling a sentiment lexicon from a database of tweets, with an explanation of the method and results.
The RWeka package is used for unigram tokenization...

Here's a detailed example with R code of benchmarking three tweet-sentiment classifiers in terms of precision, recall and F-measure.

The code has been released as a gist on GitHub, and we also look at interpreting confusion matrices.
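
For readers who want to see how these three measures fall out of a confusion matrix, here is a small Python sketch. It is not the gist mentioned above: the function name `per_class_metrics` and the matrix layout (rows are true labels, columns are predicted labels) are assumptions made for illustration.

```python
def per_class_metrics(confusion, labels):
    """Compute precision, recall and F-measure (F1) per class.

    confusion[i][j] is assumed to hold the number of items whose true
    label is labels[i] and whose predicted label is labels[j].
    """
    metrics = {}
    for k, label in enumerate(labels):
        true_pos = confusion[k][k]
        predicted_as_k = sum(row[k] for row in confusion)  # column total
        actually_k = sum(confusion[k])                     # row total

        precision = true_pos / predicted_as_k if predicted_as_k else 0.0
        recall = true_pos / actually_k if actually_k else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / denom if denom else 0.0

        metrics[label] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
        }
    return metrics
```

With a made-up matrix such as `[[8, 2], [1, 9]]` and labels `["positive", "negative"]`, the positive class has 8 true positives out of 9 predicted positives (precision 8/9) and out of 10 actual positives (recall 0.8).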
