### Brydon Parker

Joe is a software engineer living in Lower Manhattan who specializes in machine learning, statistics, Python, and computer vision.

Impressive!

A while back I wrote about how the classical non-parametric bootstrap can be seen as a special case of the Bayesian bootstrap. Well, one difference …

"The problem posed is the classic one, along these lines [...]: given that a biased die averaged 4.5 on a large number of tosses, assign probabilities for the next toss, x. This problem can seemingly be solved by Bayesian Inference, or by MaxEnt with a constraint on the expected value of x: E(x) =4.5. These two approaches give different answers!"

https://letterstonature.wordpress.com/2008/12/29/where-do-i-stand-on-maximum-entropy/

My title is taken from a similarly titled article by the physicist Ed Jaynes, whose work influenced me greatly. It refers to a controversial idea of epistemological probability theory: the method o...

Thanks, a good read.

"The prior distribution p(theta) in a Bayesian analysis is often presented as a researcher’s beliefs about theta. I prefer to think of p(theta) as an expression of information about theta."

http://andrewgelman.com/2015/07/15/prior-information-not-prior-belief/

The prior distribution p(theta) in a Bayesian analysis is often presented as a researcher’s beliefs about theta. I prefer to think of p(theta) as an expression of information about theta. Consider this sort of question that a classically-trained statistician asked me the other day: If two Bayesians are given the same data, they will come …

I'd say that it should be no bother that the model is different.

It's frequently the case in machine learning that the model builder's skill and experience have a large effect on the quality.

"This is the Bayesian approach. You have a belief according to existing evidence and theories. If a new bit of evidence comes in you don’t discard all prior knowledge, or pretend that we currently know nothing. You simply update your belief, adding the new information to existing information. In this way our beliefs slowly evolve, tracking with new evidence and ideas (unless you have a large emotional investment in one belief, but that’s another post)."

http://theness.com/neurologicablog/index.php/in-defense-of-prior-probability/

This post is a follow up to one from last week about reproducibility in science. An e-mailer had a problem with the following statement: 'I tend to accept...

This is a writing problem concerning a Bayesian analysis I hope to publish. There is a simple idea that I just can't justify succinctly. People must have to deal with it all the time, but I can't find any references. It's driving me nuts!

The concept I want to express: as we consider a larger range of data values, a model must be made more complicated in order to remain useful.

As an example: if I drop a rock a distance of one metre, I can probably get away with a constant-acceleration model. If I drop a rock a distance of a kilometre, I have to consider air resistance. If I drop it a distance of 1000 kilometres, I must consider orbital dynamics. ...

Correspondingly, one way of managing the need for model complexity in a Bayesian model is to limit the range of data values. In my particular circumstances, I can do that at an acceptable cost.

There must be a name for this concept. It's gotta be published somewhere. Google is failing me. The paper will lose a lot of focus if I have to chase this tangent. Can anybody suggest a useful reference? Or even a useful term to google?

Dan Mazur

The concept is just that of approximation.

I would say "Within the range <describe range>, the model can be approximated by a simplified version where <list parameters> are ignored."
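
The rock example from the question can be made concrete with a rough sketch. It compares the fall time from a constant-acceleration model with the closed-form fall time under quadratic air drag. The terminal velocity of 50 m/s is a made-up illustrative value, the function names are hypothetical, and neither model covers the 1000 km case, where gravity itself can no longer be treated as constant.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2, treated as constant


def fall_time_constant_g(height_m):
    """Fall time from rest with no air resistance: h = g t^2 / 2."""
    return math.sqrt(2.0 * height_m / G)


def fall_time_quadratic_drag(height_m, terminal_velocity):
    """Fall time from rest with drag proportional to v^2.

    Uses h(t) = (v_t^2 / g) * ln(cosh(g t / v_t)), solved for t.
    """
    x = G * height_m / terminal_velocity ** 2
    # arccosh(e^x) rewritten as x + ln(1 + sqrt(1 - e^(-2x))) to avoid overflow
    acosh_exp_x = x + math.log1p(math.sqrt(-math.expm1(-2.0 * x)))
    return terminal_velocity / G * acosh_exp_x


def relative_error(height_m, terminal_velocity=50.0):
    """Relative error of the simple model, taking the drag model as reference.

    The default terminal velocity is an assumed, illustrative value.
    """
    simple = fall_time_constant_g(height_m)
    full = fall_time_quadratic_drag(height_m, terminal_velocity)
    return abs(full - simple) / full


for h in (1.0, 1000.0):
    print(f"{h:7.0f} m: relative error of constant-g model = {relative_error(h):.4f}")
```

Under these assumptions the simplified model is nearly indistinguishable from the fuller one over one metre and far off over a kilometre, which is the sense in which restricting the range of data values licenses the simpler model.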

My PhD thesis is on a historical language change from Latin to Old French (probably a very boring subject for many people), and I am trying to use Bayesian inference for my data, which is mainly categorical (with a logistic regression GLM as a prior). Surprisingly, I have never seen a previous linguistic study that used Bayesian statistics (except for language-evolution prediction models, which are different). I am not even sure how to present and explain my choice of priors and my models to a non-statistical, non-Bayesian audience. I would greatly appreciate your insights!

Thank you, Mad!

Help me solve:

"In a study of 1000 people, 8% were found to have tuberculosis. The 1000 people were then given a new test, which detected tuberculosis in 96% of those who have it and in 2% of those who don't have it. What's the probability of a randomly chosen perso

As part of a somewhat complicated high-school conditional probability question, I devised the following argument to get my answer to agree with the given answer. Please tell me whether this argument is valid, with links to any relevant theorems.

Below,

Three genes M, N, O that are responsible for eye color occur randomly among adults, and one person can have only one of these genes. Among children, the probabilities of having brown or black eyes, given that the parents are a combination MM, MN, MO, etc., are given separately.

P(Ci) = probability that the parents' randomly occurring genes M, N, O join in combination Ci to produce a baby,

i.e., C1 = MM, C2 = MN, C3 = MO, C4 = NM, C5 = NN, C6 = NO, ..., C9 = OO

P(A) = probability of both parents having black eyes

P(B) = probability of the child having brown eyes

P(A∩B)

= Σ P([A∩B] | Ci) P(Ci)

= Σ P(A | Ci) P(B | Ci) P(Ci)

= Σ P(A | Ci) P(B∩Ci)

Note that in the second step I assumed

P([A∩B] | Ci) = P(A | Ci) P(B | Ci)

because A and B are both events that depend on C only. So I take it that, given that Ci has occurred (when the sample space is restricted to Ci only), the events A and B can be considered independent of each other.

Is this argument correct? Please support your answer with links or references to any theorems.
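
The factorization assumed in the second step is the definition of conditional independence of A and B given Ci. It is an extra assumption about the joint behaviour of A and B, and the individual values P(A | Ci) and P(B | Ci) do not imply it on their own. A small numerical sketch, with made-up probabilities and only three combinations instead of nine, shows two joint distributions that share the same P(Ci), P(A | Ci) and P(B | Ci) but give different P(A∩B):

```python
# All numbers below are invented for illustration; they are not from the problem.
p_c = [0.5, 0.3, 0.2]          # P(Ci), a partition, so the values sum to 1
p_a_given_c = [0.2, 0.5, 0.9]  # P(A | Ci)
p_b_given_c = [0.1, 0.6, 0.4]  # P(B | Ci)

# Case 1: A and B conditionally independent given each Ci,
# so P(A∩B | Ci) = P(A | Ci) * P(B | Ci).
p_ab_independent = sum(
    pc * pa * pb for pc, pa, pb in zip(p_c, p_a_given_c, p_b_given_c)
)

# Case 2: within each Ci the rarer event only happens when the other one does,
# so P(A∩B | Ci) = min(P(A | Ci), P(B | Ci)). The conditional marginals are unchanged.
p_ab_coupled = sum(
    pc * min(pa, pb) for pc, pa, pb in zip(p_c, p_a_given_c, p_b_given_c)
)

print("P(A∩B) with conditional independence:", p_ab_independent)
print("P(A∩B) with maximal dependence:      ", p_ab_coupled)
```

The first step of the derivation (the law of total probability over the partition Ci) holds in both cases. Whether the second step is valid depends on whether the problem's setup actually justifies conditional independence.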

Top 11 Free Software for Text Analysis, Text Mining, Text Analytics

KH Coder, Carrot2, GATE, tm, Gensim, Natural Language Toolkit, RapidMiner, Unstructured Information Management Architecture, OpenNLP, KNIME, Orange-Textable and LPU

Read more: http://wp.me/p43LB9-o

Review of Top 11 Free Software for Text Analysis, Text Mining, Text Analytics: KH Coder, Carrot2, GATE, tm, Gensim, Natural Language Toolkit, RapidMiner, Unstructured Information Management Architecture, OpenNLP, KNIME, Orange-Textable and LPU are some of the key vendors who provide text analytics software

The non-parametric bootstrap was my first love. I was lost in a muddy swamp of zs, ts and ps when I first saw her. Conceptually beautiful, simple to …

Check out David Draper's talk for an interpretation of the bootstrap as a Dirichlet process:

https://users.soe.ucsc.edu/~draper/draper-irvine-15-may-2014.pdf
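
For readers who want to compare the two bootstraps discussed in this thread, here is a minimal sketch. It assumes Rubin's formulation of the Bayesian bootstrap, in which each replicate reweights the observations with flat Dirichlet(1, ..., 1) weights, while the classical bootstrap resamples them with replacement. The data values and function names are made up for illustration, and the statistic is just the mean.

```python
import random


def classical_bootstrap_means(data, n_draws, rng):
    """Resample the data with replacement and record the mean of each resample."""
    n = len(data)
    return [sum(rng.choices(data, k=n)) / n for _ in range(n_draws)]


def bayesian_bootstrap_means(data, n_draws, rng):
    """Reweight the data with Dirichlet(1, ..., 1) weights and record weighted means."""
    means = []
    for _ in range(n_draws):
        # Normalized independent Gamma(1, 1) draws are Dirichlet(1, ..., 1) weights.
        gammas = [rng.gammavariate(1.0, 1.0) for _ in data]
        total = sum(gammas)
        means.append(sum(g * x for g, x in zip(gammas, data)) / total)
    return means


rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # hypothetical observations

classical = classical_bootstrap_means(data, 4000, rng)
bayesian = bayesian_bootstrap_means(data, 4000, rng)

print("sample mean:             ", sum(data) / len(data))
print("classical bootstrap mean:", sum(classical) / len(classical))
print("Bayesian bootstrap mean: ", sum(bayesian) / len(bayesian))
```

The only difference between the two functions is the distribution of the weights: integer counts divided by n in the classical case, continuous Dirichlet weights in the Bayesian case.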

"I don’t think statistical models are representations of the data at all (barring one exception, which I will discuss later). Instead, they are representations of the *prior information* that our analysis is assuming"

https://plausibilitytheory.wordpress.com/2015/07/10/what-is-a-statistical-model/

What is a statistical model? This question was posed recently by the excellent "Stats Fact" Twitter account, which linked to a paper that was too complicated for me to understand, involving categor...

Nice high-level talk on the difference between Frequentism and Bayesianism

https://clip.mn/video/yt-KhAUfqhLakw

Pre-Bayesian: Ridiculous, probabilities are without doubt objective. They can be seen in the relative frequencies they cause.

Bayesian: So if p = 0.75 for some event, after 1000 trials we’ll see exactly 750 such events?

Pre-Bayesian: You might, but most likely you won’t see that exactly. You’re just likely to see something close to it.

Bayesian: Likely? Close? How do you define or quantify these things without making reference to your degrees of belief for what will happen?

Pre-Bayesian: Well, in any case, in the infinite limit the correct frequency will definitely occur.

Bayesian: How would I know? Are you saying that in one billion trials I could not possibly see an “incorrect” frequency? In one trillion?

Pre-Bayesian: OK, you can in principle see an incorrect frequency, but it’d be ever less likely!

Bayesian: Tell me once again, what does ‘likely’ mean?
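
The numbers in the dialogue can be checked with a short sketch. It assumes the 1000 trials are independent with the same p = 0.75, so the count of events is binomial; the window of ±30 around 750 is an arbitrary choice for "close". This only quantifies the Pre-Bayesian's claim under that assumed model and does not settle what 'likely' means.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials, computed in log space."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )
    return math.exp(log_pmf)


n, p = 1000, 0.75

# Probability of seeing exactly 750 events (roughly 0.03 under this model).
p_exactly_750 = binom_pmf(750, n, p)

# Probability of seeing a count within 30 of 750; the window is an assumed choice.
p_within_30 = sum(binom_pmf(k, n, p) for k in range(720, 781))

print("P(exactly 750 events):", p_exactly_750)
print("P(720 to 780 events): ", p_within_30)
```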

+charles griffiths But sometimes a decision needs to be made, whether or not we have "answers about the actual world". Losses are least using the Bayesian approach.

This is perhaps the first real crack in the wall for the almost-universal use of the null hypothesis significance testing procedure (NHSTP). The journal, Basic...

I guess they took the phrase "lies, damn lies, statistics" a bit literally. By the way, this is not a "crack" sort of thing. I mean, Cumming's “dance of the p-value” argument is valid, but do we have a better alternative that keeps its simplicity?

Good evening all

I have encountered a counter-intuitive result while thinking about Bayesian networks and decided to ask the members of this group

Suppose A is the probability space of all possible events (with P(A)=1, of course)

Now suppose A is partitioned into A1 and A2 such that P(A1)=P(A2)=0.5 and let a be some event in A

According to Bayes' Rule,

P(A1|a) + P(A2|a) = P(A1)/P(a) * P(a|A1) + P(A2)/P(a) * P(a|A2) =

= 0.5/P(a) * (P(a|A1) + P(a|A2)) = 0.5 since P(A1) = P(A2) = 0.5

P(A1+A2) = 1 and A1 and A2 are disjoint, yet P(A1+A2|a) ~= P(A1|a) + P(A2|a) = 0.5

But A1 and A2 are disjoint and P(A1) + P(A2) = 1 so a must be fully contained in the union of A1 and A2 since it is contained in the universal probability space A.

Likewise, P(a|A1) + P(a|A2) = (P(a)/0.5) * (P(A1|a) +P (A2|a))

My head is spinning from this. Is there a rationalization I don't know about?
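
A quick numerical check may help locate the problem. The sketch below uses made-up values for P(a|A1) and P(a|A2) and applies the law of total probability, P(a) = P(a|A1) P(A1) + P(a|A2) P(A2). With P(A1) = P(A2) = 0.5 this gives P(a|A1) + P(a|A2) = 2 P(a), not P(a), so the expression 0.5/P(a) * (P(a|A1) + P(a|A2)) evaluates to 1 rather than 0.5.

```python
# Prior probabilities of the two parts of the partition, as in the post.
p_A1, p_A2 = 0.5, 0.5

# Hypothetical likelihoods, chosen only for illustration.
p_a_given_A1 = 0.2
p_a_given_A2 = 0.6

# Law of total probability over the partition {A1, A2}.
p_a = p_a_given_A1 * p_A1 + p_a_given_A2 * p_A2

# Bayes' rule for each part.
p_A1_given_a = p_a_given_A1 * p_A1 / p_a
p_A2_given_a = p_a_given_A2 * p_A2 / p_a

# The sum of the likelihoods is 2 * P(a) here, not P(a).
likelihood_sum = p_a_given_A1 + p_a_given_A2

print("P(a)                 =", p_a)
print("P(A1|a) + P(A2|a)    =", p_A1_given_a + p_A2_given_a)
print("P(a|A1) + P(a|A2)    =", likelihood_sum)
print("0.5 / P(a) * that sum =", 0.5 / p_a * likelihood_sum)
```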

Could someone point me to some literature on setting priors?

Specifically, I want to set a prior for click-through rate estimation, but I also want to penalize a subset of the results based on the cardinality of the set, which doesn't sound like a very Bayesian thing to do. Naturally, some reading could help. Thanks!
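
One possibly relevant observation, offered as a sketch rather than an answer: if "cardinality" here means the number of impressions behind each estimate (an assumption, since the post does not say), a Beta prior on the click-through rate already discounts small samples by pulling their estimates toward the prior mean. The Beta(2, 98) prior and the function name below are invented for illustration.

```python
def posterior_mean_ctr(clicks, impressions, prior_alpha=2.0, prior_beta=98.0):
    """Posterior mean of a click-through rate under a Beta prior.

    With a Beta(alpha, beta) prior and binomial clicks, the posterior is
    Beta(alpha + clicks, beta + impressions - clicks). The default prior
    (mean 0.02) is an assumed, illustrative choice.
    """
    return (prior_alpha + clicks) / (prior_alpha + prior_beta + impressions)


# Same raw rate of 30%, very different amounts of evidence.
small_sample = posterior_mean_ctr(clicks=3, impressions=10)
large_sample = posterior_mean_ctr(clicks=300, impressions=1000)

print("raw rate 3/10     -> posterior mean", small_sample)
print("raw rate 300/1000 -> posterior mean", large_sample)
```

The small sample is shrunk heavily toward the prior mean while the large one stays close to its raw rate, so the penalty for low counts comes from the prior itself rather than from an ad hoc correction.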

Splendid, thank you. I will go through this!

I have been asked to improve the way predictions of the reliability of my company's products are made. I think I understand Bayes' Theorem, but putting it into practice is another thing. I have the number of hours that the subsystem runs before it fails, whether we have a corrective action for the failure mode (I don't have a lot of confidence in the root causes or effectiveness), the total number of machines built in a quarter, and the percent new content for the next generation of machine. So can I predict what the reliability will be at the start of production for the new product and a year into production? Can I use Excel, R, or Minitab? I use a Mac, by the way.

Since no one else has answered you, I will: sure you can, and you can do it in Excel and especially in R. You have to use survival analysis, with the number of machines, etc. as covariates. See for example:

http://www.hindawi.com/journals/mpe/2012/329489/

or

http://www.springer.com/statistics/physical+%26+information+science/book/978-0-387-77948-5
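
As a starting point before the covariate-based survival models in those references, here is a minimal sketch of the simplest Bayesian reliability model: a constant failure rate with a conjugate Gamma prior. This is a deliberately simplified stand-in, not the method from the links above, and the prior, failure count, and operating hours are all invented numbers.

```python
def update_failure_rate(prior_shape, prior_rate, failures, total_hours):
    """Gamma-exponential conjugate update for a constant failure rate.

    Prior: rate ~ Gamma(shape, rate). Data: a number of failures observed
    over a total number of operating hours. Returns the posterior parameters.
    """
    return prior_shape + failures, prior_rate + total_hours


def predictive_reliability(shape, rate, hours):
    """Posterior predictive probability that a unit survives `hours`."""
    return (rate / (rate + hours)) ** shape


# Hypothetical prior: roughly one failure per 1000 hours, held weakly.
post_shape, post_rate = update_failure_rate(
    prior_shape=1.0, prior_rate=1000.0, failures=4, total_hours=19000.0
)

posterior_mean_rate = post_shape / post_rate
reliability_1000h = predictive_reliability(post_shape, post_rate, 1000.0)

print("posterior mean failure rate per hour:", posterior_mean_rate)
print("predicted reliability at 1000 hours: ", reliability_1000h)
```

The same arithmetic fits in a spreadsheet. Percent new content or corrective actions could enter through the choice of prior for the next-generation machine, though how to do that well is a modelling decision this sketch does not address.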
