I'm combining elements from two very different projects I started..

In +Subsurface, I originally used XML as the save file format, because while it is probably the worst format ever designed, it's what other projects used, and subsurface wasn't originally really capable enough to stand on its own.

Now, I did the best that I could with XML, and I suspect the subsurface XML is about as pretty and human-readable as you can make that crap, but it really doesn't scale as a file format, and it's generally a complete disaster. 

So +Dirk Hohndel (who maintains subsurface - I long since gave up that role) and I have been idly talking about better save formats for months now. But binary formats are evil and generally not extensible, and besides, you want replication and network transparency, plus the ability to combine dives from many different sources. Binary blobs are just horrible for all of these things.

So I've wanted to use the +Git object database format, because it's actually very well designed, if I say so myself. Not only does it do efficient deduplication and compression, it has the advantage that it still does a really good job at line-based textual representations, while allowing a very natural representation of multiple different events as separate files, with git itself tying it all together.

And you get backups and history for free, plus a lot of tools to look at it all, seeing readable diffs for format changes or just new dives etc. In fact, both Dirk and I already ended up using git to track our XML files because of these issues, but it wasn't very nicely integrated.

So I've been thinking about this for basically months, but the way I work, I actually want to have a good mental picture of what I'm doing before I start prototyping. And while I had a high-level notion of what I wanted, I didn't have enough of an idea of the details to really start coding.

Until a couple of days ago, when everything came together.

So now I'm happily hacking on a new save format using "libgit2", and apart from the git_treebuilder interface being horrible I think I'm making good progress. 
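For the curious, the basic libgit2 flow looks roughly like this - a minimal sketch, not the actual subsurface code, with a made-up file name and dive contents:

/* Write one "dive" into the git object database. */
#include <git2.h>
#include <string.h>

int save_dive(git_repository *repo, git_oid *tree_out)
{
	git_treebuilder *bld;
	git_oid blob_id;
	const char *dive = "duration 45:30 min\nmaxdepth 32.1 m\n";

	/* Each dive becomes one blob (one "file")... */
	if (git_blob_create_frombuffer(&blob_id, repo, dive, strlen(dive)) < 0)
		return -1;

	/* ...and the (horrible) treebuilder ties the blobs together
	 * into a directory tree object. */
	git_treebuilder_new(&bld, repo, NULL);
	git_treebuilder_insert(NULL, bld, "01-dive", &blob_id, GIT_FILEMODE_BLOB);
	git_treebuilder_write(tree_out, bld);
	git_treebuilder_free(bld);
	return 0;
}

From there, a commit object pointing at the tree gives you the history for free.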
 
one good thing you can say about XML is that it's open and widely supported.  that may also be the only good thing you can say about it.
 
XML is pretty good for one thing: document markup. That's it. It shouldn't be used for anything else if there's an alternative.
 
Just about anything is better than XML. JSON. RFC822 headers. INI files. Lisp s-expressions. Tcl "array put" format. sqlite3 files. Berkeley DB files.
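For a concrete feel, here's the same made-up dive record in two of those formats:

    JSON:  {"dive": {"depth_m": 32.1, "duration_min": 45}}

    INI:   [dive]
           depth_m = 32.1
           duration_min = 45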
 
+Mark Martin  - Yeah. Plus, it takes forever to parse compared to other formats. That's why AJAX calls will use JSON most of the time, as opposed to XML (which is the X in AJAX). 
 
INI files are not too bad, or simply key=value\n
 
At least you have the leverage to push another format.
 
JSON is pretty great as it's still very human readable. Is the git format human readable?
 
What's wrong with YAML or CoffeeScript-style JSON for configs? Or binlogs like MySQL makes, with a mysqlbinlog reader?
 
Not sure if you've heard of eet. It's part of the EFL libs and, while it's binary, it has a nice decompiler and I've heard you can do some nice things. Maybe integrating that with your git magic can make it more awesome
 
+Linus Torvalds The company I work at is using libgit2 to store the main database for one of our products. The data consists of a tree of objects, and changes are made in job transactions.

We previously used SQL, but using git works really well. Getting the history, diffs and replication for free is just great. 
 
XML - nuke it from orbit, it is the only way to be sure.
 
+Aaron Traas no, XML isn't even good for document markup.

Use 'asciidoc' for document markup. Really. It's actually readable by humans, and easier to parse and way more flexible than XML.

XML is crap. Really. There are no excuses. XML is nasty to parse for humans, and it's a disaster to parse even for computers. There's just no reason for that horrible crap to exist.

As to JSON, it's certainly a better format than XML both for humans and computers, but it ends up sharing a lot of the same issues in the end: putting everything in one file is just not a good idea. There's a reason people end up using simple databases for a lot of things.

INI files are fine for simple config stuff. I still think that "git config" is a good implementation. 
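
For reference, the git config flavor of INI looks like this - real syntax, though the remote URL here is made up:

    [core]
            filemode = true
            bare = false
    [remote "origin"]
            url = git://example.com/subsurface.git
            fetch = +refs/heads/*:refs/remotes/origin/*

Sections, quoted subsections, and simple "key = value" lines. That's about as much structure as a config file needs.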
 
+Mathieu Duponchelle it's not really in a sharable form yet. I have the object tree writer done, but right now it's more of a "this is the skeleton of what we will write" than anything full-bodied.

I think I'll send out a first concept patch to the subsurface list later today. Possibly tomorrow.  But that will be a write-only thing, it won't be able to read it back yet (but you can see what was written using git tools).

Reading the data back will be the next stage.
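
For example, the standard plumbing is enough to poke at whatever the new writer produces (the dive path here is hypothetical):

    git log --stat                      # one commit per save
    git ls-tree -r HEAD                 # each dive shows up as a file
    git cat-file -p HEAD:trip/01-dive   # dump one dive's text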
 
+Tomas Skäre sounds interesting. I might have to delve into this a bit... see if I can find any use for it myself. 
 
Did some yaml stuff recently and liked it. But git rocks too so I'm sure this will be great.
 
Wow this sounds amazing. Can't wait to hear more... Deserialising a changed object has always been a mess... This might solve it
 
The Linux way is to make things that do one thing and do it well. XML tries to do everything, and ends up doing them very poorly. XML must die.
 
+Patrick Bulteel If nothing else, it was a great way to learn the inner workings of git, since you really have to do much more manually - things that the git binary just takes care of magically.
 
JSON and YAML are good for structured data. Asciidoc is good if it's mostly text. SQLite and CSV are good for tables. INI is good if it is single-value.
 
So, if we teach you to fly gliders, +Linus Torvalds, will you set out to write the best possible gliding computer, too? 
 
+Linus Torvalds binary formats are not evil if you know how they work. They can even be extensible.
XML or something like it is also stored in a binary format - I hope you know. This is called ASCII or UTF-8 or such. But in general XML is just a better "hexdump", the way many people design it. The ASCII-ness of the data does not make it easier to understand why the data is stored in this or that way.
Well, most people think it's easier to have ASCII data...
 
XML isn't so bad. XSLT and especially XPath are. 
 
Actually, XML isn't that bad because it should be used together with XSL. That way it's not just another data format but also a simple database (using xpath). It can contain documentation on data types and meaning. Yeah, it's verbose but that's not always a bad thing.

Edit: Almost everything was said before while I was writing this -_-
 
I don't think that XML should be avoided like the plague, but yeah, there are alternatives that are more readable and more straightforward to parse. XML is still with us because it entered the stage early.
 
For anyone who, like me, is completely unfamiliar with Git's object data storage, there is documentation at http://git-scm.com/book/en/Git-Internals-Git-Objects - unfortunately it is not very clear on how fields within objects are stored.

I've got no interest in diving, but I am curious to see what +Linus Torvalds can do for a better data storage format than XML or JSON. 
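
(For what it's worth, the low-level storage is simple: a loose object is just the zlib-deflated bytes of "<type> <size>\0<payload>", named by the SHA-1 of that whole string. The plumbing makes it easy to experiment:

    echo 'some content' | git hash-object -w --stdin   # store a blob, print its SHA-1
    git cat-file -t <sha1>                             # shows the type, "blob"
    git cat-file -p <sha1>                             # prints the content back

Trees then map names to blob/tree SHA-1s, and commits point at trees.)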

+Helmar Wodtke Binary formats are useful if you are bandwidth or size constrained, though even then a gzipped text format is probably better.

Computers in the 21st century are powerful enough to deal with the slight overhead of text based formats, and having a format you can view and munge with a text editor and the standard unix text processing tool chain is a huge advantage. I can't think of anything besides sound, video, and images that I would want to store in binary nowadays. Just about everything else should be text, in some sort of self-labeling, extensible format.  

(Note: All text should be stored in UTF-8 as well, ASCII should only be used for backwards compatibility with 20th century software.)
 
I thought the jabber config was as bad as things could get. Then I met ActiveMQ, and the very worst... hadoop-*. XML is what you do to a sysadmin if waterboarding him would get you fired.
 
+Kevin C. Well, text formats do have a big "pro". But that was not really my point. You have a counter-point in your assumption that 21st century computers can do all that. This thinking is "old style" and in my opinion a little bit "bad". The slower computers will be everywhere. You'll have the slower computer inside an energy-saving light... Or inside your shoes - I do not exactly know what it can do there, but it will be there.
Be careful with assumptions about what is needed. You could end up with a quote like "XYZ kilobytes are enough".
The smaller and the smarter you can make algorithms, the better for usage.
If you can program a light-bulb with your algorithm, it's better than having a big iron that has to make calculations...
 
God bless genius! Long live Linus Torvalds!!!
 
Thanks for reminding me to promote Git, that shit works fantastically! I've tried Mercurial and Subversion and several others, but I really do like Git.  #git  
 
At least XML is better than HTML, in that it actually can be parsed at all without knowing every little bit of the schema.

Damning with faint praise...
 
+Keenan Chadwick Mercurial works that far - not exactly like Git, but leave other people's needs where they are.
You forgot Monotone ;)
 
I think the important thing is to foresee an abstraction layer like SQLAlchemy, regardless of the storage system behind it. Then, it would make sense to develop the interoperability between the git database and SQLAlchemy. SQLite is a good alternative too.

Edit: of course, that wouldn't be my choice if I were looking for performance...
 
Why not simply "mount" your structured data file like you would a .iso? The structure would be represented as a directory tree and some kind of fs driver would do the rest... 
 
+Linus Torvalds
 "I didn't have enough of a idea of the details to really start coding.
Until a couple of days ago, when everything came together."
this is me, with everything Linux. I know it's just a matter of time. I anxiously await the "a-ha" moment. Thanks
 
CVS? Cool, eh - this still exists? Wait, I used this 10 years ago. RCS a little bit longer (shame on me :)  )
RCS is cool in this way. Local, for textual things like LaTeX sources or C sources or such, it's usable. But Monotone or Git or Mercurial is really better ;) Sorry that I prefer Mercurial... There is a good online service for repos in Australia ;)
 
+Linus Torvalds what is +Helmar Wodtke talking about?

All I know is my HP 42" printer asks if I wanna use Binary, ASCII, or software? I just use the software option, in my lame terms. (Which is best?)

+Linus Torvalds like most popular culture, I learned about your brilliant mind after falling in love with +Android. Thank you for all you've done & your open, non-evil approach.

off topic... I work in Printing, which is expected to decline a good bit by 2020.

How does one get going in the right direction of self-taught hacking & computer science? I know Macs, PCs & only know Android (for Linux)

Any feedback would make this Geek happy. Tks!
 
+Helmar Wodtke True, I was thinking of full computers, not embedded processors, but I did say that if you were size constrained it may be an exception. I've not done any embedded development myself, but my impression was that current generation embedded processors all exceed the CPU power of 80s era desktop computers by now. I would guess even most shoe computers have enough power to avoid binary formats, but obviously it ultimately depends on the capabilities of your target platform.

CVS is still alive and well, or at least the CVSNT fork of it is, which fixed the same issues with it that SVN did while building on the existing code base. We're still using CVSNT, as Git does not meet our needs at all. We need file locking, as our work involves editing big binary blobs that can't be merged (Crystal Reports) and we want to make sure two developers don't accidentally try to change the same report at the same time. As far as I know none of the distributed version control systems support file locking. If one does, I'd be interested to hear about it so I can look if it would meet our needs.
 
No recommendations for edn? Fixes the lack of expressiveness in JSON
 
I have to agree with Linus; XML sucks. Really, really bad.
 
Finally.. Someone else who hates XML as much as I do!  It is the most verbose & wasteful file format ever hatched!
 
By the way, I like the compile-on-commit git feature, and ability to execute .git archives in any secondary standalone format.
 
I do like XML.

I do like XSLT and XSL-FO.
Even though XSL style sheets are a "functional language" - I still prefer procedural languages.

I'm writing XML manually using emacs .. in fundamental mode.
 
Lots of weirdos out there...
 
Waits for fs/gitfs to appear. 
 
Thinking for months before starting your project.. ummm I used to call that "procrastination".. LOL.. 
 
What about JSON? It seems to be the replacement of XML.
 
Qt has several database bindings you could also use; if you used the DB methodology you could have it locally save SQLite files, or additionally allow users to connect to remote databases. I'm unsure how complex your formatting is, but if you could put it into tables, it's an option.

Additionally, there's a csv database driver you can use to avoid losing readability - but I haven't tried it myself.

I've only written a couple very modest Qt applications, but the SQL utilities made it simple to implement.
 
Don't worry, we are all sure that eventually, if you try, you can even fix git_treebuilder!
 
+Tristan Colgate That is when you find that not only did someone pour water on your chair, but that when you sit on it, the electrodes hooked underneath close the circuit............. (Said the BOFH ;) )
 
QtCore has a very efficient JSON binary format. But yea it's not meant to replace a database.
 
One blob per item? How is the data internally organized? Once you start needing some structure within a blob, "git log -p" and "git diff A B" will start needing some domain-specific help from your application to understand what really got changed, so...

Puzzled...
 
+H. Peter Anvin: it's basically a cache format of the parsed JSON structure, which can also be dumped to disk and sent over the network. But it's limited in the fact that it's still JSON and supports only JS types. For example, numbers are IEEE 754 double-precision, so you can't save 64-bit integers, and strings are actually Unicode (read: UTF-16 because it's Qt), so you can't save arbitrary binary data.
 
Everything a nail.. er I mean git repository.
:P
 
They just keep adding hats, don't they...
 
Git uses a DAG, which is a very nice raw layout and search format.
 
+Junio C Hamano for subsurface, we keep all data as text. So then instead of using XML, we just do a simple line-based format, and do the (small amount) of structure using the directory structure.

So each dive is one file, a dive trip is a subdirectory with dives, etc etc. So "git log -p" actually results in perfectly readable output.

And what the git format gives us - and what a database would not - is the history of the data.

And most of the time, the dive data itself doesn't change. You just add new dives etc. But "most of the time" isn't all the time, and having history of the data has really been very useful when we've added new functionality, or broken something. That was useful when putting the XML files into git, and it's an advantage of having the data natively in git.
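
Purely as illustration - the real naming scheme may differ - the tree can look something like:

    2013-10-Maldives/        <- one trip = one subdirectory
        01-Banana-Reef       <- one dive = one line-based text file
        02-Manta-Point
    2013-12-Red-Sea/
        01-Blue-Hole

so a new dive is just a new file, and "git log -p" shows it as a plain readable addition.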
 
The complaints against XML are often a lot like when people complain about Perl because they decided to write a 10,000+ line program and ended up with spaghetti code. XML is good for a certain set of problems, just like Perl is excellent for one-off scripts.
 
+Anno v. Heimburg XML is a lousy data exchange format. If you're looking for a decent human-readable data exchange format, JSON and YAML are both much better than XML.
 
+Kyle Stoltz blah blah blah. XML isn't good for one-liners, nor is it good for 10,000-liners. And comparing it to perl is silly, and giving XML much too much credit.

I have yet to hear of any actual good use of XML. It's inefficient to parse, and the "wonderful" tools for it are horrendous. The only thing less readable than XML is XSLT. 

So just give it up. The complaints against XML are well-founded.
 
When it is about ERD, all those complaints against XML disappear; because those other formats duplicate effort in parse while XML already has it done. The problem is people use XML the wrong-way. That is obvious when they refuse to see the obvious compilation format of XML. Someone who intends to serialize an entire database in XML is wrong, +Linus Torvalds. Someone who intends to describe an entire database in XML is right. XML is already serialized, so it is obvious when people try to re-serialize it and complain why it sucks.
 
What you are looking for is a system for persisting data, not just flat file storage. That's why git is more suitable. There are other solutions, like NoSQL engines. Maybe an embeddable one would be a good alternative. 
 
There is an aspect to .git that is already bare metal, +Amine Abbes. I don't use negative connotations to describe atomic structures, so I'll stop there, on this thread, before I make any mess here.
 
I hate all the XML hating. XML is great for simplicity.... for configuration. It's extensible. Easier to humanly read than an INI file, with better organization.
 
Thank you for all your efforts, greatly appreciated.
 
Is this the same Linus Torvalds who made Linux? That's pretty awesome that he reaches out to people like this.

As for the actual content of the discussion, I gave up on XML for storing/passing data many years ago in favour of JSON. I honestly have not seen anything that is more consistent, and does a better job.

Not to mention it just looks better.
 
+Linus Torvalds Thanks for making the OS world suck less. Keep up the awesome and inspirational work.
 
wooh I like taking chemistry...cool story..
 
JSON structures do not have to be just in one file. I ended up writing different files and directories by convention. 
 
Mmm... "Really good job at line-based textual representations"; "a lot of tools to look at it all," sounds like you may one day use it for storing and querying DNA data?
 
+Jonathan Ballard can you expound just a bit on what you mean by '_describe_ a database in xml'...    
 
+Peter da Silva Tcl dicts are even better than array put. They are actually the most universal format and luckily lack any : and = characters.
 
I recently evaluated Jansson (JSON) and libConfuse. Depending on context I'd go with either, but for human readable UNIX .conf files I always recommend libConfuse.
 
XML is like a drug: if it is solving all your problems, maybe you are using it too much!
 
Files ending in ".Z" are stored compressed.  With the RedSea filesystem, all files are stored contiguously.
 
great! now no one will understand the format but you!
 
XML is super great for description. (That is, if interoperability is part of your plans.) When something does one thing and does it well, at least give it some respect. It happens that XML can do much more, but not everything well, so people just call it crap for that. How about you just continue choosing another format better suited for your task and stop wasting energy on calling XML crap?

You could very well use the INI file format to do the same tasks XML can, but it would be hell. Then you would declare the INI format all crap since it doesn't do something complicated efficiently.

 
Linus: how about writing a FUSE filesystem, that way you can just use git directly?
 
I have a file manager which loads the entire drive into a tree document in my text editor.  With thousands of files, it doesn't scale because it loads all files before showing a view.  My operating system is a toy, though.  I am very tempted to store the directory tree as a text tree document with block numbers for files.  A major problem would be doing updates.  You would have to write the whole disk directory tree every time a change was made.  With a thousand or two files, you could, but it's kinda crappy.
 
+Thomas Wilde +1 for getting inspiration from Camlistore. It is one of those projects that can be a building block for many higher level systems. 
 
A reason why many people hate XML is the verbosity. But this is because there are a lot of use cases that many people don't use.

Take for example XML schemas: really handy if you want to communicate XML to other people, but annoying if you don't know what they are.

Making a competitor for XML is hard, because the success of XML is the functionality. I do see an opportunity for making a format with fewer use cases.
 
+Steve Nell Besides DTD, look here: raw.github.com/Dzonatas/Program/master/X.cs

That is the entire ECMA335 language described atomically (within constraints of automatons and XML). ECMA335 can process all other data formats described above, so by reflection "the better" format is moot beyond the physical layer. I used "bison -xml=file" to help generate that file.

ECMA335 is one solution for what +Linus Torvalds wants here, but the BCL was not designed for root level access. That is why i suggested CLI/LSB here: github.com/Dzonatas/solution

I comprehend the desire for the .git database that appears like the higher-level file-system of object files. "blah blah blah [my VM can do X better than Y with RT-patches] blah blah blah..."
 
Good to know I'm not the only person who thinks XML is pretty much the worst thing in the world
 
When speaking about data formats... it's very ironic that we all use HTML everywhere, every day (even here), and it has the same grandparents as XML. So looking at the Internet: there is a big use case and it seems to work somehow. Making something easily readable means restricting it or putting logical layers somewhere else (like in a DB with tables). But I'm very interested in how you want to solve that problem with git :)
 
XML is great for predominantly textual structured information that you need to exchange in an interoperable yet extensible manner.

It has limitations (mostly lack of binary, though base64 works), and it's pretty unpleasant to write manually, but it's good for strictly structured data. (Other formats which fit this space are almost anything ASN.1-based, and those have different trade-offs.)

JSON is great for predominantly textual structured data where the structure is unilaterally defined. No traditional interop, no extensibility. It, too, has limitations. There's also a bunch of other formats in this space, including things like CSV at one end of the scale.

RFC 5322 (né RFC 822) format headers are pretty much the worst example of this latter format space, which is why serious protocol designers stopped using them for new stuff years ago. Only they look, at first glance, quite attractive, so inexperienced protocol designers keep on reusing the model (HTTP, SIP).
 
+Linus Torvalds I'm a CS undergrad and I've barely started working on my dream project: a Tibetan word processor. I've been putting off learning XML, but I always thought I had to use it if I was to do document markup. What would you advise me to use instead of XML for document markup?
 
+Linus Torvalds  That's interesting. I'm designing a GIT-inspired global data storage system myself.

My views differ from your (firmly stated) vision, so I decided to design my own system (let's call it "Memory") rather than trying to change GIT.


All information forms a global world-scale graph of connected data blobs. (Well,  somewhat disjoint graph.)
Repositories are just local caches of the global sub-graph. Just bags of blobs with some "reference" metadata like branches etc.
Less focus on files, folders and commits. The main entity is the general data blob. Links can represent the connection between blobs (like "blob2 is based on blob1"). Properties can add metadata to a blob (time, etc).
GIT is based on a set of file tree "snapshots" attached to the commit graph spine. Apart from the linking to the unchanged parts, the only connection between the snapshots is via the commits graph.
In the Memory system, the main commit "spine" is not that important and is not the only link between data. When you change the file data, several different links/relations are created. The link between blob versions, the link between file versions, the link between directory versions etc. When committing these changes, a parallel set of commit-parts is created (well, there is a single commit object and links that connect it to every change link). 
You can work in a subdirectory and checkout/commit/push only it. Two people pushing changes to two different subdirs don't conflict with each other.  (Someone then may propagate their changes to the repo root.)
The repo split operation is basically no-op.
Submodules/subtrees/externals are elementary.
Repo merges/grafts are very easy.  (Just add the new links and add them to the "point of view" data).
You can base files on the files in any repo in the world, preserving the history without copying it.
Committed and pushed a gigabyte file to a widely used repo? Just delete it, commit, push and then ask the maintainers to physically delete the blob file from the repo (cache). Repo is just a cache of a sub-graph of the global world graph - dead links are normal.

I'm thinking beyond the files and filesystems.

Think about torrents, DHT and magnet links. Imagine filling the local cache not from a specific repo, but from "the Internet".

Now think about the Internet and hyperlinking. Imagine hyperlinks have corresponding "magnet links" to the content. Information mirroring. No more dead link (unless nobody is interested in mirroring the data). Linking to the data, not to the place.

That's my vision.

Technical:
Apart from the opaque blobs, all structures are hierarchical data. You mentioned XML, but my first thought was LISP's S-expressions and their serializations like Canonical S-expressions (http://en.wikipedia.org/wiki/Canonical_S-expressions) . It's interesting that torrents use a similar serialization system (http://en.wikipedia.org/wiki/Bencode) for the torrent metadata storage to make the torrent hash stable.

I want to use a hashing system that allows (inline) data to be interchangeable with its hash reference. So, the tree (hierarchical data) equal to the same tree where some subtree was replaced by its hash reference. You can physically deconstruct a tree into a set of nodes without changing the hash. You can view the structural data as a set of parts as well as a whole.
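
A rough C sketch of that interchangeability idea - fnv1a() here is just a toy stand-in for a real cryptographic hash like SHA-1:

#include <stdint.h>
#include <string.h>
#include <stdio.h>

static uint64_t fnv1a(const void *buf, size_t len, uint64_t seed)
{
	const unsigned char *p = buf;
	uint64_t h = seed ? seed : 1469598103934665603ULL;
	while (len--)
		h = (h ^ *p++) * 1099511628211ULL;
	return h;
}

struct node {
	int is_ref;            /* 1: only the hash is present */
	uint64_t hash;         /* valid when is_ref == 1 */
	const char *data;      /* payload when is_ref == 0 */
	struct node *child[2]; /* NULL-terminated children */
};

/* A parent mixes in its children *by their hashes*, so replacing a
   subtree with its hash reference cannot change the parent's hash. */
static uint64_t node_hash(const struct node *n)
{
	uint64_t h;
	int i;

	if (n->is_ref)
		return n->hash;
	h = fnv1a(n->data, strlen(n->data), 0);
	for (i = 0; i < 2 && n->child[i]; i++) {
		uint64_t ch = node_hash(n->child[i]);
		h = fnv1a(&ch, sizeof(ch), h);
	}
	return h;
}

int main(void)
{
	struct node leaf = { 0, 0, "blob data", { NULL, NULL } };
	struct node full = { 0, 0, "root", { &leaf, NULL } };
	struct node ref  = { 1, node_hash(&leaf), NULL, { NULL, NULL } };
	struct node coll = { 0, 0, "root", { &ref, NULL } };

	/* Same hash whether the subtree is inline or deconstructed. */
	printf("%llx == %llx\n", (unsigned long long)node_hash(&full),
	       (unsigned long long)node_hash(&coll));
	return 0;
}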
 
JSON > YML > XML

I use JSON for just about everything, Desktop to the Web. Completely replaced XML for me 
 
listen is good........................
 
Please,
can you give us the repository link of this work-in-progress project?
Thanks
 
ASCII > XML > UTF-8 > JSON5/ES5 > RFC822 > YML > JSON > ASN.1 > ...

It is useless to argue with the physical layer and lossy data structures. The only thing better than XML is ASCII, but then we hear the same circular argument for UTF-8. At least UTF-1 and UTF-7 were dropped from standards to help stop that.
 
I actually hated XML also, but then you learn about XSD and at my work we have several software companies working together with E-FFF (a protocol based on UBL 2.1 - Software companies exchanging invoices)... Thank god for .xsd checks.
 
I'm surprised about the amount of hate that XML faces here. XML is certainly not without fault and certainly not the ideal format for everything which it is applied to now (and almost certainly the wrong format for what Linus is doing here), but it does have an excellent ecosystem, wide platform and IDE support, transformation tools, a choice of schema languages, a lot of established and useful standards and specifications that use it as a foundation etc. That's difficult to assess as a whole, but one probably shouldn't underestimate the importance of it. Building comparable ecosystems around other formats sometimes really does have a faint taste of re-inventing the wheel, don't you think? :-)
 
Why not use some kind of embedded database à la SQLite? 
 
JSON, XML, Protobuf, everything has its purpose. I like the down-to-earth approach in Thrift, https://en.wikipedia.org/wiki/Apache_Thrift, with TBinaryProtocol, TCompactProtocol, TDebugProtocol, etc. Do you want human readability, go for that. Do you want machine interpretability, go for that. Do you want to have human readable diffs, go for that.
 
Smart binary formats have a distinct advantage over text or other formats - you can eliminate the distinction between in-memory and on-disk format. Avoid parsing altogether, use a single-level-store with zero-copy reads, and you get bullet-proof performance with no wasted memory or CPU. Like OpenLDAP back-mdb on LMDB.
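
A minimal sketch of that zero-copy read path using LMDB's C API (error handling omitted; the ./db directory and key are made up):

#include <lmdb.h>
#include <stdio.h>

int main(void)
{
	MDB_env *env;
	MDB_txn *txn;
	MDB_dbi dbi;
	MDB_val key = { 5, "hello" }, data;

	mdb_env_create(&env);
	mdb_env_open(env, "./db", 0, 0664);        /* ./db must already exist */
	mdb_txn_begin(env, NULL, MDB_RDONLY, &txn);
	mdb_dbi_open(txn, NULL, 0, &dbi);

	/* data.mv_data points straight into the memory map: the on-disk
	   format *is* the in-memory format, nothing parsed or copied. */
	if (mdb_get(txn, dbi, &key, &data) == 0)
		printf("%.*s\n", (int)data.mv_size, (char *)data.mv_data);

	mdb_txn_abort(txn);
	mdb_env_close(env);
	return 0;
}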
 
Mess around with QR-Codes, and you'll like XML more. 
 
Mr +Linus Torvalds how about using YAML? It is a great human-readable format. I used it on a project for some config files and it was great. Easy to read and write.
 
The only reason for XML's existence is the ability to specify a DTD (Document Type Definition) so applications can check if the data conforms to a given data type. Nobody does that, hence XML is ugly and useless. Not to mention that the true/false check on data format tells us nothing about what to do with invalid data :)
 
Any text file that needs a program to make it more readable is fail
 
Soz to everyone that likes XML, but XML was completely craptastical in '98 and it's only gotten worse over time.
 
+Peter da Silva scratching tick marks on the concrete floor of your dungeon room with bloody stump fingers
 
+Miodrag Milić I'm sorry, I guess you are very new to computer science?

The discussion here is on data storage / data transport (data interchange) in general, whether that is achieved by a relational database, SQLite, Berkeley DB, XML, JSON, YAML, etc... The first three only meet the data storage need and would require a separate application layer for the data interchange.

Natural Docs is a documentation tool intended to translate code comments into project documentation. Though Natural Docs may or may not be a useful tool for documentation, it isn't actually a markup format, as you called it, or a possible solution in this case.

I am not trying to mock you or belittle you in any way, just trying to help with some information in this wonderful, ever-changing and sometimes confusing world of computer science. Hope that my comment has been of some use to you.

May karma follow you on your path through life always, and have a great day.
 
XML is great - a typed content system. Just try doing calf in JSON! It doesn't even support IEEE 754.
 
The solution to XML problems is to add more XML. Start with Namespaces in XML, then add some SOAP to the mix. I'm pretty sure everything is fixed as soon as we can discover Subsurface interfaces defined with XML Schemas using UDDI. ;-)
 
Lucky you not having time pressure to deliver.
 
"I'm surprised about the amount of hate that XML faces here."
said no one ever that regularly uses a Unix shell or sometimes writes C code...
 
XML is great, says a programmer who writes in C, C++, Obj-C, JS, R, XSLT perl and bash. (Me). To each tool, an application.
 
Awesome Linus! I have been looking for a good save file format myself. Xml must die!
 
XML is great for... nothing. XML is sufficient for some purposes, sometimes, if you just can't use better alternatives such as JSON for serialization and data transfer or YAML for config files.

Only case where XML-inspired languages shine is HTML, especially HTML5 with semantic markup for a UI with sane, pre-defined and short tags. Just for comparison, XAML (.NET pure XML for UI structure and layout and stylesheet and functionality) sucks really, really big time.

HTML5 (almost XML) + CSS (non-XML) + javascript (non-XML) really seems to be the only winning combination with (almost-)XML on it.

DTD was a great idea, but what really prevents you from implementing something like it with JSON, for example?
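
(Nothing, really - JSON Schema is exactly that kind of attempt. A tiny hypothetical example:

    {
        "type": "object",
        "properties": {
            "depth_m": { "type": "number" }
        },
        "required": ["depth_m"]
    }

Whether the validation actually gets used is another question, just like with DTDs.)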
 
Well, I have embraced the XMP metadata standard for my images, which is based on XML.
 
Use XML or JSON for complex structured data. Use delimited or fixed-length files for simple data (easy to bring into Excel for data analysis).
 
Hey I am in my ma house

 
+Kevin C. This discussion is about using the characteristics of git to BE your Crystal Reports, at least for this program. It would allow your users to both edit the same data object/file at the same time and keep a history of changes, letting you go back and fix issues if someone happened to make the wrong choice. 

No data would be lost, and you would have a complete history, essentially in a file that can be used like git.
 
I find myself incapable of talking about XML for more than about 3 minutes before I start laughing so hard I can barely continue the conversation.  I picked up one of those huge XML books once and I tried to find the definition of "well-formed XML document".  I finally found it on page four hundred something: "A well-formed XML document is one that satisfies the so-called well-formedness constraints".  I'm pretty sure those books are sold by the pound.