inconsequence

musings on subjects of passing interest

Why CSV?


CSV is a terrible format for data. TSV is superior in every way, and XML is far more robust (but horribly verbose). Somehow, CSV remains popular. Software engineers seem completely resigned to it, often supporting CSV and not even allowing TSV, let alone subtly suggesting it by, say, defaulting to it in export options.

What's wrong with CSV?

The basic problem with CSV is that it uses fairly common characters (comma and the inch symbol commonly used as a quotation mark) to separate data fields that may include both commas and quotation marks.

This means that, despite its apparent simplicity, CSV requires a stateful parser to ingest and, because dialects disagree on quoting and escaping, in fact cannot be handled perfectly.
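To see why state matters, here is a small sketch contrasting a naive comma split with a minimal quote-aware scanner. It assumes RFC 4180-style quoting (fields wrapped in double quotes, embedded quotes doubled), which is only one of several dialects in the wild, and the function names are made up for illustration.

```typescript
// Naive approach: split on commas, ignoring quotes entirely.
function naiveSplitCsvLine(line: string): string[] {
  return line.split(',')
}

// Minimal quote-aware scanner for ONE line, assuming RFC 4180-style quoting.
// It does not handle newlines inside quoted fields.
function splitCsvLine(line: string): string[] {
  const fields: string[] = []
  let field = ''
  let inQuotes = false // the state a plain split() cannot track
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i)
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') {
        field += '"' // a doubled quote is a literal quote
        i++
      } else if (ch === '"') {
        inQuotes = false
      } else {
        field += ch
      }
    } else if (ch === '"') {
      inQuotes = true
    } else if (ch === ',') {
      fields.push(field)
      field = ''
    } else {
      field += ch
    }
  }
  fields.push(field)
  return fields
}

const csvLine = '1,"Loewald, Tonio","6"" ruler"'
console.log(naiveSplitCsvLine(csvLine).length) // 4: the quoted comma split a field
console.log(splitCsvLine(csvLine)) // [ '1', 'Loewald, Tonio', '6" ruler' ]
```

Even this much code only covers a single line in a single dialect, which is rather the point.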

There is in fact no formal standard for CSV: RFC 4180 is only an informational memo, and real-world files routinely deviate from it.

A quick search for the "best csv parser for javascript" yields lots of articles listing alternatives. fast-csv has a bundle-size of 57kB (16kB gzipped).

It's ridiculous.

What's great about TSV

TSV uses the ASCII control characters (tab and newline) that were literally designed for the express purpose of delimiting tabular data. If you need to include tab and newline characters within the data, there's a simple standard way to do it.
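The post doesn't say which escaping convention it has in mind. A common one, assumed here, is backslash escapes, where `\t`, `\n`, and `\\` in the file stand for tab, newline, and backslash (the IANA registration for text/tab-separated-values instead simply disallows tabs inside fields). A sketch with hypothetical helper names:

```typescript
// Assumed convention: backslash escapes (\t, \n, \\) inside field values.
function escapeTsvField(value: string): string {
  return value
    .replace(/\\/g, '\\\\') // escape backslashes first
    .replace(/\t/g, '\\t')
    .replace(/\n/g, '\\n')
}

function unescapeTsvField(field: string): string {
  // Single pass, so an escaped backslash followed by "t" is not misread as a tab.
  return field.replace(/\\([tn\\])/g, (_match: string, ch: string) =>
    ch === 't' ? '\t' : ch === 'n' ? '\n' : '\\'
  )
}

const rawValue = 'line one\nline two\twith a tab'
const escapedValue = escapeTsvField(rawValue)
console.log(escapedValue.includes('\t') || escapedValue.includes('\n')) // false
console.log(unescapeTsvField(escapedValue) === rawValue) // true
```

With fields escaped this way, splitting on raw tabs and newlines stays safe.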

Not only are there dedicated keys on every keyboard for dealing with TSV, many text editors and word-processors will do a pretty good job of displaying the content of TSVs nicely, and the files will tend to be slightly smaller.

Parsing TSVs is so simple that it doesn't really deserve the word "parsing". You don't need a library; in TypeScript / JavaScript it's basically one line of code…

// Split into rows on newlines, then split each row into fields on tabs.
function parseTsv(source: string): string[][] {
  return source.split('\n').map(line => line.split('\t'))
}
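Going the other way is just as short. Here is a sketch of a serializer (the name `toTsv` is made up here), assuming no field contains a raw tab or newline:

```typescript
// Hypothetical inverse of the parser above.
// Assumes fields contain no raw tabs or newlines.
function toTsv(rows: string[][]): string {
  return rows.map(row => row.join('\t')).join('\n')
}

const rows = [
  ['name', 'quote'],
  ['Tonio', 'He said "CSV is terrible", twice'],
]
const tsv = toTsv(rows)

// Commas and quotation marks need no special treatment.
const roundTripped = tsv.split('\n').map(line => line.split('\t'))
console.log(JSON.stringify(roundTripped) === JSON.stringify(rows)) // true
```

One caveat with the split-based approach: a source that ends with a trailing newline produces a final row containing a single empty string, and Windows line endings leave a `\r` on the last field of each row, so real code may want to normalize those first.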

Please, please stop using CSV.

Post Script

At a meeting yesterday I discussed this issue with several of my non-technical (Finnish) colleagues. They did not know TSV was even an option, but pointed out that they have huge issues with European decimal separators (i.e. commas) in spreadsheet output. So it's even worse in Europe.

Tonio Loewald, 11/15/2024
