Read the linked article for a different perspective on "lines of code". Also Emily's writeup of an array languages symposium: http://coding-is-like-cooking.info/2013/09/an-introduction-to-array-languages/
Also look at this, a text editor written in a handful of lines: http://kparc.com/$/edit.k
Emily says it "looks like line noise". You know what else looks like line noise? Compressed text files. Compression crams as much information as possible into the smallest space, getting rid of all the redundancy. Text before compression is highly redundant; that redundancy aids understanding, in particular by affording error correction. If I spell it "eror" you still know what I mean.
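As a quick illustration of that point, here is a minimal sketch using Python's standard zlib module: a highly redundant English text shrinks dramatically, and the compressed bytes are unreadable "line noise".

```python
# Redundant English text compresses well precisely because it is redundant;
# the compressed output, with the redundancy squeezed out, looks like noise.
import zlib

text = b"the quick brown fox jumps over the lazy dog " * 50
compressed = zlib.compress(text, 9)

print(len(text), len(compressed))  # the compressed form is far smaller
assert len(compressed) < len(text)
```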
What's a program? A description of an abstract machine's behavior. Which is to say, a text, and a text entails structure and redundancy. You can compress the redundancy away to different extents, depending on the language.
If you want to evaluate how well a compressor performs, you can't just take the size of the output (compressed) text as the metric. You have to add the size of the decompressor. A smarter decompressor is bigger but can compress more efficiently. For instance, if your decompressor includes every word in the English dictionary you can represent a word with a single number, its index in the dictionary. (Plus a few bits for case and declension and so on.)
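The dictionary idea can be sketched in a few lines (a toy word list stands in for the full English dictionary; in a real accounting, that word list is part of the decompressor and its size must be counted too):

```python
# A "decompressor" that carries a word list can transmit each word as a
# single index. The list itself is part of the decompressor's size.
WORDS = ["the", "quick", "brown", "fox"]  # stand-in for a full dictionary
INDEX = {w: i for i, w in enumerate(WORDS)}

def compress(sentence):
    # each word becomes one small number: its index in the shared dictionary
    return [INDEX[w] for w in sentence.split()]

def decompress(codes):
    return " ".join(WORDS[i] for i in codes)

codes = compress("the quick brown fox")
assert decompress(codes) == "the quick brown fox"
```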
The goal of VPRI seems to be compression in that sense. The program above, "edit.k", is very small but the decompressor must have a lot of sophistication.
"Lines of Code" is useless as a metric because it totally fails to account for the redundancy/compression tradeoff.
I've seen several software systems which consisted largely of massive amounts of dead simple code generated from much smaller data files and a somewhat sophisticated code generation engine. That's one way to compress. The generated code lines don't "really" count towards the system's cognitive complexity. (Except... One of the worst ideas in programming, by the way, is to then check those code files into the version control system and start modifying them by hand. If you do that, you end up in a peculiar circle of Programming Hell.)
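A minimal sketch of that compression-by-generation pattern (the data table and field names here are hypothetical): a small data file plus a generator stands in for a much larger body of dead simple code.

```python
# The small "data file": only this table and the generator below carry
# real information; the generated code is pure boilerplate.
FIELDS = [("name", "str"), ("age", "int")]

def generate_accessors(fields):
    # emit one trivial accessor function per field
    lines = []
    for field, _type in fields:
        lines.append(f"def get_{field}(record):")
        lines.append(f"    return record['{field}']")
    return "\n".join(lines)

generated = generate_accessors(FIELDS)
namespace = {}
exec(generated, namespace)
assert namespace["get_age"]({"name": "Ada", "age": 36}) == 36
```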
I've seen software systems where WAY too much of the behavior was specified by a set of XML files (Spring configuration, if you must know). These would normally not be counted as "lines of code", but in this case they should.
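For readers who haven't met it, this is roughly what such wiring looks like (bean names and classes here are hypothetical, not from any real project):

```xml
<!-- Hypothetical Spring XML wiring: a large part of the behavior
     lives here, outside the Java source -->
<bean id="orderService" class="com.example.OrderService">
  <property name="repository" ref="orderRepository"/>
</bean>
<bean id="orderRepository" class="com.example.JdbcOrderRepository"/>
```

To predict what the system does, you must read the Java and this file together.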
Why? Because these files account (on these projects anyway) for a large portion of the cognitive complexity. Making changes to anything usually involves searching through Java files to see where the behavior is, then searching for the names of particular classes in the XML to see what objects are wired to what other objects. Predicting the behavior becomes a juggling exercise.
This is poor compression, like using Run-Length Encoding on 32-bit photo files. Like RLE, I suppose XML files have their limited domain of applicability, but mostly I've seen them contribute "noise".
We all know that noise compresses poorly. In programs, the equivalent of noise is accidental complexity: say, using a Data Transfer Object to load values into an instance then passing the instance to a "procedural-written-in-OO" set of functions, then writing the result back to a DTO. (If you want a simpler example: assigning value X to variable A, but by way of several unnecessary intermediate steps involving other variables.)
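The simpler example, made concrete (variable names are illustrative): the same assignment written directly, then padded with intermediate steps that carry no information but must still be read.

```python
x = 42

# Essential: the value of x ends up in a.
a = x

# Accidental complexity: identical behavior, routed through unnecessary
# intermediate steps, the in-program equivalent of noise.
tmp = x
holder = {"value": tmp}
extracted = holder["value"]
a_noisy = extracted

assert a == a_noisy == 42
```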
When you count a system's LOC, you are most often going to count a large amount of accidental complexity, "noise".
This is the correct intuition behind Function Points (flawed as they are), that you need independent ways of expressing the following three distinct concepts:
- how large a program NEEDS to be, "essential complexity"
- how large it HAPPENS to be, "contingent complexity"
- how hard it is to WORK with, "cognitive complexity".
Function Points have aimed at formulating a relation between the first two, and pinning down number one. Cyclomatic Complexity can be understood as a (not very successful) attempt at number three.