Shawn O'Hare
Mathematician and Data Scientist
Posts
Categorical Limits and Colimits
Many types of universal constructions that appear in a wide
variety of mathematical contexts can be realized
as categorical limits and colimits. To define a limit
we first need the notion of a cone of a diagram.
A diagram of type $J$ in a category $\math...
TF-IDF for Document Clustering
In this post we discuss a standard way to encode a text document as a vector using a term frequency-inverse document frequency (tf-idf) score for each word, with an aim to cluster similar documents in a corpus. Suppose we have a corpus of text documents $\...
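As a rough illustration of the tf-idf encoding mentioned in this preview, here is a minimal sketch. It assumes one common variant of the score (relative term frequency times the natural log of the number of documents over the document frequency); the post itself may use a different weighting. The function name `tf_idf` and the toy corpus are made up for the example.

```python
import math
from collections import Counter

def tf_idf(corpus):
    """Return one {term: score} dict per document in a list of tokenized documents."""
    n_docs = len(corpus)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in corpus for term in set(doc))
    vectors = []
    for doc in corpus:
        counts = Counter(doc)
        # Assumed variant: tf = count / document length, idf = log(N / df).
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical toy corpus, already split into words.
docs = [["cat", "sat", "mat"], ["cat", "ate", "fish"], ["dog", "ate", "fish"]]
vectors = tf_idf(docs)
```

Under this variant a term that occurs in every document gets a score of zero, so only words that distinguish documents from one another contribute to the clustering.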
The Continuum Hypothesis and Weird Probabilities
I recently heard of an interesting "proof" that $(0,1)$ does not
have cardinality $\aleph_1$. This would disprove the Continuum Hypothesis
(\textbf{CH}), which asserts that any subset of $(0,1)$ is either countable or
has the same cardinality as $(0,1)$...
Simple Divisibility Tests
In this post we will justify some common divisibility tests that most
school children are familiar with. Recall that a number such as
$111$ is divisible by $3$ if and only if the sum of the digits is
divisible by $3$. Since $1+1+1=3$ is divisible by $3$,...
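The digit-sum test for $3$ described in this preview can be checked numerically. The sketch below is only an illustration, not code from the post; the helper names `digit_sum` and `divisible_by_3` are made up. The test works because $10 \equiv 1 \pmod 3$, so a number and the sum of its digits leave the same remainder on division by $3$.

```python
def digit_sum(n: int) -> int:
    """Sum of the base-10 digits of a non-negative integer."""
    return sum(int(d) for d in str(n))

def divisible_by_3(n: int) -> bool:
    """Apply the digit-sum test for divisibility by 3."""
    return digit_sum(n) % 3 == 0

# The digit-sum test agrees with direct division on every input tried here.
assert all(divisible_by_3(n) == (n % 3 == 0) for n in range(10_000))
```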
A Simple Proof that the Harmonic Series Diverges
One usually encounters the harmonic series
$\sum_{k=1}^{\infty} \frac{1}{k}$
as an example of a series that diverges for non-obvious reasons.
With $H_n$ defined to be the partial sum $H_n:=\sum_{k=1}^n \frac{1}{k}$,
a typical way to prove divergence i...
Flickr Tags are Useless
Recently I began to code up an image classifier, and my hope was to use Flickr images as a source of real data.  The Flickr API allows you to easily obtain the 100 most recent images that have a given tag.  However, I found that the correspondence between t...
Sample Covariance Matrix
Suppose $X_1, \dots, X_p$ are random variables and $X:=(X_1, \dots, X_p)$ is the random vector
of said random variables. Given a sample $\mathbf x_1, \dots, \mathbf x_N$ of $X$, how do we obtain an estimate
for the covariance matrix $\text{Cov}(X)$? To this...
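For orientation, one common estimate is the unbiased sample covariance matrix, which averages products of deviations from the sample mean with a divisor of $N-1$. The sketch below assumes that estimator; the preview is cut off before the post states which one it derives. The function name `sample_covariance` and the data are made up for the example.

```python
def sample_covariance(samples):
    """Unbiased sample covariance matrix of N observations of a p-dimensional vector."""
    n = len(samples)
    p = len(samples[0])
    # Sample mean of each coordinate.
    means = [sum(x[j] for x in samples) / n for j in range(p)]
    # Entry (i, j) averages the products of deviations, dividing by N - 1 (assumed).
    return [
        [sum((x[i] - means[i]) * (x[j] - means[j]) for x in samples) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

# Hypothetical sample of N = 3 observations with p = 2 coordinates.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
cov = sample_covariance(data)
```

The result is symmetric, with the sample variances of the individual coordinates on the diagonal.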
US Life Expectancy Data
I took a look at the World Health Organization's data on life expectancy at birth for various countries.  The number given for each year assumes that levels of mortality will remain roughly the same throughout the individual's life, and i...