It is far from unusual to find computational speed a limiting factor in some types of data analysis. The development of higher-level languages with features like automatic memory management has made programming more accessible to most scientists, but some problems still require squeezing the most out of machine resources. Generally this involves “dropping down” into a compiled language like C, C++, or Fortran.
For those more interested in science than in programming or software engineering, Fortran is an excellent language for implementing compute-intensive numerical methods.
I learned about the following project, Quickr, which takes an R script and converts it to a Fortran program that can be compiled and executed orders of magnitude faster than the original R.
After watching the video, my understanding is that the project is limited to translating single R functions. So don’t expect entire R software projects to be easily translated this way, but compile times should be short.
There has been talk of using LLMs for source-to-source translation from one language to another, but I don’t understand why this would be preferred to a deterministic translation based on fundamentals from compiler theory.
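To make the “deterministic translation” point concrete, here is a toy sketch (in Python purely for illustration; it is not how Quickr works) of the compiler-theory approach: parse the source into an AST, then walk the tree and emit equivalent text in the target syntax. The function names and the Fortran-flavored output format are my own inventions for this example. The key property is that the output is a pure function of the parse tree, so the same input always produces the same translation, unlike an LLM.

```python
import ast

# Map Python AST operator types to Fortran-style operator tokens.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/", ast.Pow: "**"}

def to_fortran_expr(node):
    """Recursively translate one AST expression node into Fortran-style text."""
    if isinstance(node, ast.Expression):
        return to_fortran_expr(node.body)
    if isinstance(node, ast.BinOp):
        left = to_fortran_expr(node.left)
        right = to_fortran_expr(node.right)
        # Fully parenthesize so operator precedence survives translation.
        return f"({left} {OPS[type(node.op)]} {right})"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise NotImplementedError(f"unsupported node: {type(node).__name__}")

def translate(src):
    # Deterministic by construction: the result depends only on the parse tree.
    return to_fortran_expr(ast.parse(src, mode="eval"))

print(translate("a * x**2 + b * x + c"))
# → (((a * (x ** 2)) + (b * x)) + c)
```

A real translator additionally handles statements, types, and control flow, but the principle is the same: every construct in the source language has a fixed, verifiable mapping into the target language.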
I posted this over on the Fortran forum. I consider Dijkstra’s position even more important today than it was decades ago because people are so much more trusting of machine output than they used to be.
Modern Fortran (i.e., Fortran conforming to the 1990 standard or any later one) is one of the best kept secrets in “data science” and scientific computing. Some new developments include:
Amen about Fortran. A true gem and incredibly fast.
Advantages of using an LLM for translation are (1) you can learn along with it and make changes yourself after the initial translation, and (2) you can take advantage of more advanced features of Fortran or other target languages.