My hypothesis is that bioinformatics favors text files, because open source tools usually start as research code.
That means two things. First, the initial developers are rarely software engineers, and they have limited experience developing software. They use text files, because they are not familiar with the alternatives.
Second, the tools are usually intended to solve research problems. The developers rarely have a good idea what the tools will eventually end up doing and what data the files will need to store. Text-based formats are a convenient choice, as it's easy to extend and change them. By the time anyone understands the problem well enough to write a useful specification, the existing file format may already be popular, and it's difficult to convince people to switch to a new format.
Yes, most bioinformatics tools are the result of research projects.
However, the most common bioinformatics file formats have actually been devised by excellent software engineers (e.g. SAM/BAM, VCF, BED).
I think it is just very convenient to have text-based formats as you don't need any special libraries to read/modify the files and can reach for basic Unix text-processing tools instead. Such modifications are often needed in a research context.
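To make the "no special libraries" point concrete, here is a rough sketch of the kind of one-off edit I mean, using nothing but the Python standard library. It assumes a BED-like layout (tab-separated, with start and end in the second and third columns); the function name `filter_bed` and the sample records are made up for illustration.

```python
import io


def filter_bed(lines, min_length):
    """Yield BED-like records whose interval spans at least min_length bases."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the comment/track/browser lines BED files may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        start, end = int(fields[1]), int(fields[2])
        if end - start >= min_length:
            yield fields


# Hypothetical sample data standing in for a real file handle.
example = io.StringIO(
    "chr1\t100\t250\tfeatA\n"
    "chr1\t300\t320\tfeatB\n"
    "chr2\t0\t1000\tfeatC\n"
)
kept = list(filter_bed(example, 100))
```

The same filter is a one-liner in awk, which is really the point: the format imposes no tooling requirements at all.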
Also, space-efficient file formats (e.g. CRAM) are often within reach once disk space becomes a pressing issue. Now you only need to convince the team to use them. :)
Totally. A good chunk of the formats are just TSV files with some metadata in a header. Setting aside the drawbacks, this approach is both straightforward and flexible.
I think we're seeing some change in that regard, though. VCF got BCF and SAM got BAM.
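For anyone who hasn't looked inside one of these files, a minimal sketch of reading the "TSV plus header metadata" pattern might look like this. It assumes the VCF-style convention where `##key=value` lines carry metadata and a single `#`-prefixed line names the columns; the sample below is heavily simplified (real VCF has more mandatory columns), and `read_tsv_with_header` and `hypothetical_caller` are names invented for the example.

```python
import io


def read_tsv_with_header(handle):
    """Split a VCF-style text file into (metadata dict, list of row dicts)."""
    meta = {}
    columns = None
    rows = []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Metadata line: "##key=value"
            key, _, value = line[2:].partition("=")
            meta[key] = value
        elif line.startswith("#"):
            # Column header line: "#COL1<TAB>COL2..."
            columns = line[1:].split("\t")
        elif line:
            if columns is None:
                raise ValueError("data line found before the column header")
            rows.append(dict(zip(columns, line.split("\t"))))
    return meta, rows


# Simplified, made-up sample; not a valid VCF file.
sample = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "##source=hypothetical_caller\n"
    "#CHROM\tPOS\tREF\tALT\n"
    "chr1\t12345\tA\tG\n"
    "chr2\t678\tC\tT\n"
)
meta, rows = read_tsv_with_header(sample)
```

Adding a new metadata key or an extra column doesn't break a reader like this, which is a big part of why the approach is so flexible.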