Denise Case
The secret of life is two words: not always so :)

Denise's posts

Post has shared content
Linux kernel creator Linus Torvalds weighs in on git and SHA1.

#computer #science #code #SHA1 #git
I thought I'd write an update on git and SHA1, since the SHA1 collision attack was so prominently in the news.

Quick overview first, with more in-depth explanation below:

(1) First off - the sky isn't falling. There's a big difference between using a cryptographic hash for things like security signing, and using one for generating a "content identifier" for a content-addressable system like git.

(2) Secondly, the nature of this particular SHA1 attack means that it's actually pretty easy to mitigate against, and there have already been two sets of patches posted for that mitigation.

(3) And finally, there's actually a reasonably straightforward transition to some other hash that won't break the world - or even old git repositories.

Anyway, that's the high-level overview, you can stop there unless you are interested in some more details (keyword: "some". If you want more, you should participate in the git mailing list discussions - I'm posting this for the casual git users that might just want to see some random comments).

Anyway, on to the "details":

(1) What's the difference between using a hash for security vs using a hash for object identifiers in source control management?

Both want to use cryptographic hashes, but they want to use them for different reasons.

A hash that is used for security is basically a statement of trust: and if you can fool somebody, you can make them trust you when they really shouldn't. The point of a cryptographic hash there is to basically be the source of trust, so in many ways the hash is supposed to fundamentally protect against people you cannot trust other ways. When such a hash is broken, the whole point of the hash basically goes away.

In contrast, in a project like git, the hash isn't used for "trust". I don't pull on people's trees because they have a hash of a4d442663580. Our trust is in people, and then we end up having lots of technology measures in place to secure the actual data.

The reason for using a cryptographic hash in a project like git is because it pretty much guarantees that there are no accidental clashes, and it's also a really really good error detection thing. Think of it like "parity on steroids": it's not able to correct for errors, but it's really really good at detecting corrupt data.

Other SCM's have used things like CRC's for error detection, although honestly the most common error handling method in most SCM's tends to be "tough luck, maybe your data is there, maybe it isn't, I don't care".

So in git, the hash is used for de-duplication and error detection, and the "cryptographic" nature is mainly because a cryptographic hash is really good at those things.
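As a rough illustration of those two uses, here is a minimal Python sketch. The `git_blob_id` function follows git's documented blob hashing (SHA1 over a `blob <size>\0` header plus the content). The `ToyObjectStore` class is a hypothetical in-memory stand-in invented for this example, not how git actually stores objects.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git computes for a blob with this content."""
    # git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ToyObjectStore:
    """Hypothetical content-addressable store (illustration only)."""

    def __init__(self):
        self.objects = {}  # maps object ID -> stored bytes

    def put(self, content: bytes) -> str:
        # De-duplication: identical content hashes to the same key,
        # so a second copy takes no extra slot.
        oid = git_blob_id(content)
        self.objects.setdefault(oid, content)
        return oid

    def is_intact(self, oid: str) -> bool:
        # Error detection: re-hash the stored bytes and compare with the key.
        # This can detect corruption but cannot repair it.
        return git_blob_id(self.objects[oid]) == oid


store = ToyObjectStore()
first = store.put(b"hello world\n")
second = store.put(b"hello world\n")
print(first == second, len(store.objects))

# Simulate on-disk corruption by changing one byte behind the store's back.
store.objects[first] = b"hello w0rld\n"
print(store.is_intact(first))
```

The same ID that names the object doubles as its checksum, which is why a single hash covers both de-duplication and corruption detection.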

I say "mainly", because yes, in git we also end up using the SHA1 when we use "real" cryptography for signing the resulting trees, so the hash does end up being part of a certain chain of trust. So we do take advantage of some of the actual security features of a good cryptographic hash, and so breaking SHA1 does have real downsides for us.

Which gets us to ...

(2) Why is this particular attack fairly easy to mitigate against at least within the context of using SHA1 in git?

There are two parts to this one: one is simply that the attack is not a pre-image attack, but an identical-prefix collision attack. That, in turn, has two big effects on mitigation:

(a) the attacker can't just generate any random collision, but needs to be able to control and generate both the "good" (not really) and the "bad" object.

(b) you can actually detect the signs of the attack in both sides of the collision.

In particular, (a) means that it's really hard to hide the attack in data that is transparent. What do I mean by "transparent"? I mean that you actually see and react to all of the data, rather than having some "blob" of data that acts like a black box, and you only see the end results.

In the pdf examples, the pdf format acted as the "black box", and what you see is the printout which has only a very indirect relationship to the pdf encoding.

But if you use git for source control like in the kernel, the stuff you really care about is source code, which is very much a transparent medium. If somebody inserts random odd generated crud in the middle of your source code, you will absolutely notice.

Similarly, the git internal data structures are actually very transparent too, even if most users might not consider them so. There are places you could try to hide things in (in particular, things like commits that have a NUL character that ends printout in "git log"), but "git fsck" already warns about those kinds of shenanigans.

So fundamentally, if the data you primarily care about is that kind of transparent source code, the attack is pretty limited to begin with. You'll see the attack. It's not silently switching your data out from under you.

"But I track pdf files in git, and I might not notice them being replaced under me?"

That's a very valid concern, and you'd want your SCM to help you even with that kind of opaque data where you might not see how people are doing odd things to it behind your back. Which is why the second part of mitigation is that (b): it's fairly trivial to detect the fingerprints of using this attack.

So we already have patches on the git mailing list which will detect when somebody has used this attack to bring down the cost of generating SHA1 collisions. They haven't been merged yet, but the good thing about those mitigation measures is that not everybody needs to even run them: if you host your project on something like or, it's already sufficient if the hosting place runs the checks every once in a while - you'll get notified if somebody poisoned your well.

And finally, the "yes, git will eventually transition away from SHA1" part. There's a plan, it doesn't look all that nasty, and you don't even have to convert your repository. There are a lot of details to this, and it will take time, but because of the issues above, it's not like this is a critical "it has to happen now" thing.

Post has shared content
Amazing work by SpaceX. :)

#space #exploration #boldlygo
Here is a very clear GIF of the recent SpaceX landing.

Post has shared content
President Trump wants your input. Take a moment to share your opinion regarding the media.

#media #press #facts #propaganda
I just learned about this survey being taken by Trump.
I answered it, not the way he would like to hear. They ask for your name and email at the end, and I was happy to supply mine, but do be warned that you need to do so.

Post has attachment
Russian hackers penetrated U.S. electricity grid through a utility in Vermont

Post has attachment
Interesting times. We should be investing in the birth of a new scientific Renaissance - our human potential is collectively greater than ever. Instead, we seem to cheer the expression of the more base aspects of our humanity: selfishness, greed, vanity. If we lose sight of the principles and values that form the foundation of the republic and if we fail to uphold the desire and resolve to practice them, then our experiment may have been heartbreakingly brief.

How Republics End

Post has shared content
The use of Python as a data science tool has been on the rise over the past few years: 54% of the respondents of the latest O'Reilly Data Science Salary Survey indicated that they used Python. The results of the 2015 survey indicated that 51% of the respondents used Python.

Post has shared content
Last Wednesday I attended the first meeting of the Data-Science Education Roundtable. This is a group initially put together by the Statistics branch of the National Research Council, but later joined by the Computer-Science branch and also ACM, so there was a bit of balance in the viewpoints.

The first meeting was devoted to getting the lay of the land, and there were presentations from four viewpoints: Stat, CS, Engineering, and Math, as well as presentations from various potential employers of the students to be educated. The entire proceedings will eventually be posted at At the moment, it is just a skeleton, however.

Interestingly, there was little disagreement about the components of DS degrees, which I might summarize as Algorithms + Statistics + Machine Learning + Cloud Computing (i.e., MapReduce and the things that followed). A surprise was that there seemed to be many participants who viewed a PhD program in DS as premature. I disagree, and surely at Stanford and many other places I know about, students are gaining PhD's with theses that are squarely in the DS area.

Post has shared content
The writing is on the wall

This is the main building of the Environmental Protection Agency or EPA, right across from Trump Hotel in Washington DC. A lot of us are protesting the guy Trump hired to demolish the EPA. His name is Myron Ebell. He's a climate change denier whose work has long been funded by fossil fuel industries.

Join the battle:

George Monbiot provides more detail:

Yes, Donald Trump’s politics are incoherent. But those who surround him know just what they want, and his lack of clarity enhances their power. To understand what is coming, we need to understand who they are. I know all too well, because I have spent the past 15 years fighting them.

Over this time, I have watched as tobacco, coal, oil, chemicals and biotech companies have poured billions of dollars into an international misinformation machine composed of thinktanks, bloggers and fake citizens’ groups. Its purpose is to portray the interests of billionaires as the interests of the common people, to wage war against trade unions and beat down attempts to regulate business and tax the very rich. Now the people who helped run this machine are shaping the government.

I first encountered the machine when writing about climate change. The fury and loathing directed at climate scientists and campaigners seemed incomprehensible until I realised they were fake: the hatred had been paid for. The bloggers and institutes whipping up this anger were funded by oil and coal companies.

Among those I clashed with was Myron Ebell of the Competitive Enterprise Institute (CEI). The CEI calls itself a thinktank, but looks to me like a corporate lobbying group. It is not transparent about its funding, but we now know it has received $2m from ExxonMobil, more than $4m from a group called the Donors Trust (which represents various corporations and billionaires), $800,000 from groups set up by the tycoons Charles and David Koch, and substantial sums from coal, tobacco and pharmaceutical companies.

For years, Ebell and the CEI have attacked efforts to limit climate change, through lobbying, lawsuits and campaigns. An advertisement released by the institute had the punchline “Carbon dioxide: they call it pollution. We call it life.”

It has sought to eliminate funding for environmental education, lobbied against the Endangered Species Act, harried climate scientists and campaigned in favour of mountaintop removal by coal companies. In 2004, Ebell sent a memo to one of George W Bush’s staffers calling for the head of the Environmental Protection Agency to be sacked. Where is Ebell now? Oh – leading Trump’s transition team for the Environmental Protection Agency.

It's not just Ebell: Trump is hiring lots of creatures from the swamp of fake industry-funded "research institutes".  For details, and links providing evidence, go here:

Post has shared content
The loser won

Having lost the popular vote, Trump is busy deleting tweets from 2012 in which he falsely claimed that Obama did the same - and argued that therefore "We should have a revolution in this country!"

Post has shared content
American leadership. Climate change is becoming expensive in the short-term as well as the long-term. Fiscal conservatives around the world should be supporting a solid plan for working together to address this issue. America needs to lead the effort, not run from it.

#climatechange #usEPA #sciencematters
This man must be stopped

Trump has said on Twitter that:

the concept of global warming was created by and for the Chinese in order to make US manufacturing non-competitive.

While he later denied saying this, he is now threatening to put Myron Ebell in charge of his Environmental Protection Agency "transition team".  Transition team?  Yes, apparently Trump wants to weaken or destroy this agency.  And if you don't know Myron Ebell, you'd better learn about him now!

Myron Ebell has said:

I don’t want to say it’s a disaster, but I think it is potentially a disaster for humankind and not necessarily any good for the planet.

What's he talking about?  Global warming?  No, he's talking about the Paris agreement to fight global warming.  He claims global warming is, on the whole, a good thing.  Why?

In fact, there is no question that most people prefer less severe winters.

After running an organization devoted to eliminating protection for endangered species, he switched to heading the Global Warming and International Environmental Policy project at an institute funded by Exxon.  His job was to sow doubt and create confusion about climate change.

But he burst into fame in 2002.  That's when he helped Bush's "council on environmental policy" water down a key report on global warming.  He was caught by Greenpeace, and a scandal erupted.  

He also tried to get the head of the Environmental Protection Agency fired.  Back then it was Christine Todd Whitman.   In a secret memo to Philip Cooney, head of Bush's anti-environmental council, Ebell wrote:

It seems to me that the folks at the EPA are the obvious fall guys, and we would only hope that the fall guy (or gal) should be as high up as possible. I have done several interviews and have stressed that the president needs to get everyone rowing in the same direction. Perhaps tomorrow we will call for [Christine Todd Whitman] to be fired. I know that that doesn't sound like much help, but it seems to me that our only leverage to push you in the right direction is to drive a wedge between the President and those in the Administration who think they are serving the president's best interests by publishing this rubbish.

"This rubbish" was a report put out by the EPA warning people of the dangers of climate change.

So, get ready: this guy will be working full-time to cause trouble!

Here's a good Scientific American article to get you up to speed:

Here's Myron Ebell on Wikipedia:

Here's Myron Ebell rewriting scientific reports:

Myron Ebell saying global warming is, on the whole, a good thing:

Here's Trump's claim that climate change is a notion invented by the Chinese:
