I worked as a postdoc in the same lab as Ming (the creator) when he was developing the first versions of this.
I am on the cheminformatics side and have done a lot of querying of data from mass spec files, so having this as a DSL is highly convenient.
Typically we would use R/Python and the wetlab focused chemists would use GUI tools to inspect and compare spectra. I have also experimented with loading data into an SQL database for the obvious reasons and because the amount of data collected in the lab was such that indexing could be highly useful.
The major convenience in MassQL is the ppm thresholding that is built into every query; expressing the same tolerance in plain SQL makes any query you might write super long and ugly. Both the built-in support for isotope ratios and the ability to query the full scan and the individual MS2 simultaneously are super useful. The wetlab people also found it much more intuitive than building a script in Python or writing convoluted SQL queries that could do the same.
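To make the ppm point concrete, here is a rough sketch of what the same tolerance check looks like when done by hand against an SQL table. The `peaks` schema, the index name, and the helper functions are all made up for illustration; this is not how MassQL works internally, just the kind of boilerplate it saves you from writing.

```python
import sqlite3


def ppm_window(target_mz: float, ppm: float) -> tuple[float, float]:
    """Return the (low, high) m/z bounds for a +/- ppm tolerance."""
    delta = target_mz * ppm / 1e6
    return target_mz - delta, target_mz + delta


# Hypothetical schema: one row per peak, tagged with its scan and MS level.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE peaks (scan INTEGER, ms_level INTEGER, mz REAL, intensity REAL)"
)
# An index on m/z is what makes range lookups cheap on large collections.
conn.execute("CREATE INDEX idx_peaks_mz ON peaks (mz)")
conn.executemany(
    "INSERT INTO peaks VALUES (?, ?, ?, ?)",
    [
        (1, 2, 226.1800, 1500.0),  # exact match
        (2, 2, 226.1809, 900.0),   # about 4 ppm off
        (3, 2, 226.1900, 700.0),   # about 44 ppm off
    ],
)


def scans_with_product_ion(conn, target_mz, ppm=5.0):
    """List MS2 scans that contain a peak within `ppm` of `target_mz`."""
    low, high = ppm_window(target_mz, ppm)
    rows = conn.execute(
        "SELECT DISTINCT scan FROM peaks "
        "WHERE ms_level = 2 AND mz BETWEEN ? AND ? ORDER BY scan",
        (low, high),
    )
    return [row[0] for row in rows]


print(scans_with_product_ion(conn, 226.18))  # scans 1 and 2 fall inside 5 ppm
```

Every extra condition (another product ion, a precursor mass, an isotope ratio) needs its own window and usually its own join, which is where the hand-written SQL gets long.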
Was not familiar with the details of mass spectrometry.
Found this[1] page which includes an explanation of how it works, and also shows some analysis using the Spectra R package.
The SpectraQL[2] package adds MassQL support to Spectra, so one can compare a bit.
MassQL Introduction: https://mwang87.github.io/MassQueryLanguage_Documentation/
I’m curious how this compares to traditional tools (like scripting in Python/R) for analyzing such datasets, both in ease of use and performance. Also, could similar query languages be developed for other fields (genomics, imaging, etc.) to empower domain experts? It’s cool to see a new DSL in academia.
This is an anti-pattern. Do not make DSLs for subdomains of science. All scientific data can be stored and queried using general-purpose data analysis tools.
Why not if it is to facilitate new discoveries and to extend the reach of computational tools to large swaths of clever and productive people?
Because SQL and other well-established systems already do this, and tools like this can be built on those systems without creating yet another domain-specific tool.
For example, this group has a set of modules locked into a specialization by using this DSL: https://en.wikipedia.org/wiki/Nextflow
The point of tools is to let us do things. In the end we don’t get a pat on the back for using one language instead of another. What we need is tools to do our actual work, which is not software engineering. If that tool is a DSL, then so be it.
General-purpose data analysis tools tend not to be well suited to most types of scientific data. And it’s not only mass spectrometry, but also microscopy, diffraction, etc. There are huge improvements to be made in these areas.
Is there no need to engage with the content at all to make this judgment? Just, DSL bad? Always and everywhere?